An efficient way in Python to add an element to a comma-separated string

I'm looking for the most efficient way to add an element to a comma-separated string while preserving the alphabetical order of the words:

For instance:

string = 'Apples, Bananas, Grapes, Oranges'
addition = 'Cherries'
result = 'Apples, Bananas, Cherries, Grapes, Oranges'

      

Also, I need a way to do the same while keeping the IDs:

string = '1:Apples, 4:Bananas, 6:Grapes, 23:Oranges'
addition = '62:Cherries'
result = '1:Apples, 4:Bananas, 62:Cherries, 6:Grapes, 23:Oranges'

      

Sample code would be much appreciated. Thank you very much.

+2


4 answers


In the first case:

alist = string.split(', ')
result = ', '.join(sorted(alist + [addition]))

      

For the second case:



alist = string.split(', ')
result = ', '.join(sorted(alist + [addition],
                          key=lambda s: s.split(':', 1)[1]))

      

If the list has many thousands of items, the first case might show a measurable performance improvement if you are willing to accept the much greater complication of bisect.insort; but bisect.insort did not support key= before Python 3.10, so on older versions the extra complication in the second case would be considerable and would probably not even buy you any performance.

The kind of optimization mentioned in the previous paragraph is only worth considering if a profile of your whole application shows that this operation is a crucial bottleneck. If it is, you will gain much more speed by keeping this data structure as a list of words and ', '-joining it only when needed, probably for output purposes, rather than splitting and rejoining thousands and thousands of times for the extremely long lists where such optimization might be warranted.
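To illustrate the "keep it as a list" suggestion, here is a minimal sketch that parses the string once, performs several additions on the list, and joins only at the end. The extra words (Kiwis, Apricots) are invented for the example:

```python
import bisect

# Parse the string once; from here on the list is the working data structure.
words = 'Apples, Bananas, Grapes, Oranges'.split(', ')

# Each addition is a binary-search insert; nothing is split or re-joined.
for new_word in ['Cherries', 'Kiwis', 'Apricots']:
    bisect.insort(words, new_word)

# Join only when the string form is actually needed, e.g. for output.
result = ', '.join(words)
```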

+8


Are you sure you should store data as a string?

It might make sense to maintain a set or a list (or, for the second case, a dictionary) and generate the string when you need it. If the data doesn't change very often, cache the string.



With any solution that uses a string as your main data store, you will probably end up creating a temporary list to make it easier to insert an item, so it makes sense to just keep the list.
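One possible shape for this, purely as a sketch: a small class (NamedItems is a hypothetical name, not something from the question) that keeps the ID-to-name mapping in a dictionary, builds the string only when asked, and caches it until the next change. It assumes IDs are integers and names contain no ', ' separator:

```python
class NamedItems:
    """Hypothetical container: maps ID -> name, renders the string lazily."""

    def __init__(self, text='', sep=', '):
        self._sep = sep
        self._names = {}
        self._cached = None  # rendered string, or None when stale
        # filter(None, ...) skips the empty entry produced by an empty input.
        for entry in filter(None, text.split(sep)):
            item_id, name = entry.split(':', 1)
            self._names[int(item_id)] = name

    def add(self, item_id, name):
        self._names[item_id] = name
        self._cached = None  # invalidate the cache on every change

    def __str__(self):
        if self._cached is None:
            # Sort by name, as in the question's second example.
            ordered = sorted(self._names.items(), key=lambda pair: pair[1])
            self._cached = self._sep.join(f'{i}:{n}' for i, n in ordered)
        return self._cached
```

Usage would look like items = NamedItems('1:Apples, 4:Bananas, 6:Grapes, 23:Oranges'), then items.add(62, 'Cherries'), then str(items) whenever the string is needed.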

+3


Here's one way to do what you want:

>>> ", ".join(sorted('Apples, Bananas, Grapes, Oranges'.split(", ") +
...                  ["Cherries"]))
'Apples, Bananas, Cherries, Grapes, Oranges'

      

and "while keeping the IDs":

>>> ", ".join(sorted('1:Apples, 4:Bananas, 6:Grapes, 23:Oranges'.split(", ") + 
...                  ["62:Cherries"], key=lambda x: x.split(":")[1]))
'1:Apples, 4:Bananas, 62:Cherries, 6:Grapes, 23:Oranges'

      

I am purposely ignoring the part of the question where you asked for the "most efficient" way to do this. Proving that an algorithm is the most efficient possible approach to a given problem is an unsolved problem in computer science. It may not be possible to do at all, and there is certainly no established methodology for doing so.

If you are concerned about efficiency, you should keep the data in intermediate data structures rather than performing these operations on strings. Any string-based operation will waste a lot of time copying memory, so you should only convert to and from strings after all processing is complete.

+2


I think a simple solution would be this:

result = string + ', ' + addition  # appends at the end; does not keep alphabetical order

      

0

