December 9th, 2012, 11:34 AM
Simple Regular Expression searching for all two-letter clusters
Hi, I have a string of random characters represented below:
text = "iemndcidpwpkeejjslsdjnnvdjsskeelxkjwnsnejxx"
I'm looking for a regular expression that will find all the two-letter character clusters within the above string. In other words, my ideal generated list would be as follows:
['ie','em','mn','nd',...] et cetera.
My code so far is this:
However, upon executing this, the resulting list generated is this:
There is no overlap! I'm looking for a way to make my regular expression return every two-letter cluster, including the overlapping ones.
Any help would be greatly appreciated!
Thanks so much.
December 9th, 2012, 12:11 PM
Using a regex to get substrings of a certain length is rather odd. That's not what regular expressions are for.
If Python doesn't have an intelligent method to get consecutive characters, simply use a "for" loop and extract each two-character substring.
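For illustration, the "for" loop suggested above might look roughly like this. It is only a sketch: the variable name "pairs" is made up for the example, and "range" assumes Python 3.

```python
text = "iemndcidpwpkeejjslsdjnnvdjsskeelxkjwnsnejxx"

pairs = []
# Stop one index short of the end so every slice holds exactly two characters.
for i in range(len(text) - 1):
    pairs.append(text[i:i + 2])

print(pairs[:4])  # ['ie', 'em', 'mn', 'nd']
```

Each step moves forward by one character rather than two, which is what produces the overlap.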
December 9th, 2012, 12:19 PM
As Jacques1 said, this isn't a job for regular expressions. Here's a one-liner that does what you want:
[text[i:i+2] for i in xrange(len(text)-1)]
That's for Python 2. For Python 3, replace "xrange" with "range".
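If a regular expression is still wanted, overlapping matches can be had by wrapping the pattern in a lookahead, since a lookahead consumes no characters. The following is a sketch for Python 3, not the original poster's code; the names "overlapping" and "sliced" are made up for the comparison.

```python
import re

text = "iemndcidpwpkeejjslsdjnnvdjsskeelxkjwnsnejxx"

# (?=(..)) matches the empty string at every position that is followed by
# two characters; the inner group captures those two characters.
overlapping = re.findall(r"(?=(..))", text)

# The slicing one-liner, written with range for Python 3, for comparison.
sliced = [text[i:i + 2] for i in range(len(text) - 1)]

print(overlapping == sliced)  # True
```

The slicing version is probably still the clearer of the two. The lookahead form is mainly useful when the clusters have to satisfy a real pattern, for example only letters.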