Get the first character of each word and its position in a sentence / paragraph
I am trying to build a map from the first character of each word to its position in a sentence or paragraph. I currently use a regex pattern for this, but regex matching is an expensive operation. Is there a cheaper way to achieve this?
Regex approach:
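import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public static Map<Character, Integer> getFirstChar(String paragraph) {
    Pattern pattern = Pattern.compile("(?<=\\b)[a-zA-Z]");
    Map<Character, Integer> newMap = new HashMap<>();
    Matcher fit = pattern.matcher(paragraph);
    while (fit.find()) {
        // Key: first letter of the word; value: its index in the paragraph.
        newMap.put(fit.group().charAt(0), fit.start());
    }
    return newMap;
}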
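Part of the cost of the method above is that it calls Pattern.compile on every invocation. A minimal sketch, assuming the method is called many times, that compiles the pattern once and reuses it. The class name FirstCharRegex and the method name firstCharPositions are made up for illustration; the regex is the one from the question.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FirstCharRegex {
    // Compiled once; Pattern objects are immutable and safe to share.
    private static final Pattern FIRST_LETTER = Pattern.compile("(?<=\\b)[a-zA-Z]");

    // Hypothetical helper: maps each word-initial letter to the index of
    // its last occurrence as a first letter in the paragraph.
    static Map<Character, Integer> firstCharPositions(String paragraph) {
        Map<Character, Integer> positions = new HashMap<>();
        Matcher matcher = FIRST_LETTER.matcher(paragraph);
        while (matcher.find()) {
            // The match is a single letter, so start() is its index.
            positions.put(paragraph.charAt(matcher.start()), matcher.start());
        }
        return positions;
    }
}
```

Calling FirstCharRegex.firstCharPositions("Hello,my name is=Helen") should give the entries H=17, m=6, n=9 and i=14. Whether this is fast enough depends on the input size, so it is worth measuring before replacing the regex altogether.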
2 answers
You can do your own linear scan if you really need to squeeze out every bit of performance:
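import java.util.HashMap;
import java.util.Map;
//             0123456789012345678901
String text = "Hello,my name is=Helen";
Map<Character,Integer> map = new HashMap<Character,Integer>();
boolean lastIsLetter = false;
for (int i = 0; i < text.length(); i++) {
    char ch = text.charAt(i);
    boolean currIsLetter = Character.isLetter(ch);
    // A letter that follows a non-letter (or the start) begins a new word.
    if (!lastIsLetter && currIsLetter) {
        map.put(ch, i);
    }
    lastIsLetter = currIsLetter;
}
System.out.println(map);
// prints the entries n=9, m=6, H=17, i=14, e.g. "{n=9, m=6, H=17, i=14}"
// (HashMap iteration order is unspecified, so the order may differ)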
Python:
wmap = {}
prev = 0
# Assumes words are separated by exactly one space.
for word in "the quick brown fox jumps over the lazy dog".split():
    wmap[word[0]] = prev
    prev += len(word) + 1
print(wmap)
If a letter appears more than once as the first letter of a word, the map keeps only its last position. To collect a list of all positions instead, change wmap[word[0]] = prev to:
    if word[0] in wmap:
        wmap[word[0]].append(prev)
    else:
        wmap[word[0]] = [prev]
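Since the question is about Java, the same "all positions" idea can be combined with the character scan from the first answer. This is only a rough sketch, assuming Java 8 or later (for computeIfAbsent and lambdas). The class name FirstCharIndex and the method name allPositions are hypothetical, and LinkedHashMap is used here only to make the printed order predictable.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class FirstCharIndex {
    // Hypothetical helper: maps each word-initial letter to every index
    // at which it starts a word, in order of appearance.
    static Map<Character, List<Integer>> allPositions(String text) {
        Map<Character, List<Integer>> positions = new LinkedHashMap<>();
        boolean lastIsLetter = false;
        for (int i = 0; i < text.length(); i++) {
            char ch = text.charAt(i);
            boolean currIsLetter = Character.isLetter(ch);
            // A letter preceded by a non-letter (or the start) begins a word.
            if (!lastIsLetter && currIsLetter) {
                positions.computeIfAbsent(ch, k -> new ArrayList<>()).add(i);
            }
            lastIsLetter = currIsLetter;
        }
        return positions;
    }

    public static void main(String[] args) {
        System.out.println(allPositions("the quick brown fox jumps over the lazy dog"));
        // prints "{t=[0, 31], q=[4], b=[10], f=[16], j=[20], o=[26], l=[35], d=[40]}"
    }
}
```

Unlike the split-based version, this scan does not depend on words being separated by exactly one space, because positions come directly from the character index.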