Get the first character of each word and its position in a sentence / paragraph

I am trying to build a map from the first character of each word to its position in a sentence / paragraph. I am currently using a regex pattern for this, but regex is an expensive operation. Is there a cheaper way to achieve this?

Regex way:

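import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public static Map<Character, Integer> getFirstChar(String paragraph) {
    // Match an ASCII letter that sits directly after a word boundary.
    Pattern pattern = Pattern.compile("\\b[a-zA-Z]");
    Map<Character, Integer> newMap = new HashMap<>();

    Matcher fit = pattern.matcher(paragraph);
    while (fit.find()) {
        // A later word with the same first letter overwrites the earlier position.
        newMap.put(fit.group().charAt(0), fit.start());
    }
    return newMap;
}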

      



2 answers


You can do your own linear scan if you really need to squeeze out every bit of performance:

                 //0123456789012345678901
    String text = "Hello,my name is=Helen";
    Map<Character,Integer> map = new HashMap<Character,Integer>();

    boolean lastIsLetter = false;
    for (int i = 0; i < text.length(); i++) {
        char ch = text.charAt(i);
        boolean currIsLetter = Character.isLetter(ch);
        if (!lastIsLetter && currIsLetter) {
            map.put(ch, i);
        }
        lastIsLetter = currIsLetter;
    }

    System.out.println(map);
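    // prints the entries n=9, m=6, H=17, i=14 (HashMap iteration order is unspecified);
    // H maps to 17 rather than 0 because "Helen" overwrites "Hello"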
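
Before dropping the regex entirely, it may be enough to stop recompiling the pattern on every call, since `Pattern` instances are immutable and can be shared. The following is only a sketch under that assumption: the class name `FirstCharRegex`, the constant `WORD_START`, and the method `firstChars` are made up for illustration, and `\\b[a-zA-Z]` is used as an equivalent of the question's `(?<=\\b)[a-zA-Z]` because `\b` is already zero-width.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FirstCharRegex {
    // Compiled once and reused across calls (hypothetical constant name).
    private static final Pattern WORD_START = Pattern.compile("\\b[a-zA-Z]");

    // Hypothetical helper: maps each word-initial letter to the index
    // of the last word that starts with it.
    static Map<Character, Integer> firstChars(String paragraph) {
        Map<Character, Integer> result = new HashMap<>();
        Matcher m = WORD_START.matcher(paragraph);
        while (m.find()) {
            // Later matches overwrite earlier ones for the same letter.
            result.put(paragraph.charAt(m.start()), m.start());
        }
        return result;
    }

    public static void main(String[] args) {
        Map<Character, Integer> map = firstChars("Hello,my name is=Helen");
        System.out.println(map.get('H')); // 17
        System.out.println(map.get('m')); // 6
    }
}
```

Note that the two approaches are not identical on every input: the scan uses `Character.isLetter`, so a letter that follows a digit (as in "a1b") starts a new word there, while `\b` sees no boundary between a digit and a letter.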

      





Python:

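wmap = {}
prev = 0  # start index of the current word
# Assumes words are separated by exactly one space and the text has no leading whitespace.
for word in "the quick brown fox jumps over the lazy dog".split():
    wmap[word[0]] = prev
    prev += len(word) + 1  # skip the word and the space after it

print(wmap)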

      



If a letter appears more than once as the first letter of a word, the map keeps only its last position. To collect a list of all positions instead, change wmap[word[0]] = prev to:

if word[0] in wmap:
    wmap[word[0]].append(prev)
else:
    wmap[word[0]] = [prev]
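
For readers working in Java, as in the question, the same "collect every position" idea could be sketched with a character scan and `Map.computeIfAbsent`. The class name `AllFirstLetterPositions` and the method `allPositions` are hypothetical, and the sketch treats any run of `Character.isLetter` characters as a word, which is an assumption rather than something the question specifies.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class AllFirstLetterPositions {
    // Hypothetical helper: maps each word-initial letter to the start
    // indices of all words beginning with that letter, in order.
    static Map<Character, List<Integer>> allPositions(String text) {
        Map<Character, List<Integer>> map = new HashMap<>();
        boolean lastIsLetter = false;
        for (int i = 0; i < text.length(); i++) {
            char ch = text.charAt(i);
            boolean currIsLetter = Character.isLetter(ch);
            if (!lastIsLetter && currIsLetter) {
                // Create the list on first sight of the letter, then append.
                map.computeIfAbsent(ch, k -> new ArrayList<>()).add(i);
            }
            lastIsLetter = currIsLetter;
        }
        return map;
    }

    public static void main(String[] args) {
        Map<Character, List<Integer>> map =
                allPositions("the quick brown fox jumps over the lazy dog");
        System.out.println(map.get('t')); // [0, 31]
    }
}
```

Unlike the `split()` version, this does not depend on words being separated by exactly one space, because positions come from the index in the original string.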

      
