Replace consecutive whitespace with a single space

Sometimes we need to replace consecutive whitespace in a string with a single space. This is handy when parsing HTML files. Let me show you two ways of doing it.

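The first one is to split the string and join the pieces back together. Called with no arguments, split() breaks the string on any run of whitespace. Here is the code snippet: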
>>> s = "a  b   c d    e f"
>>> " ".join(s.split())
'a b c d e f'

You can check more string methods here: http://docs.python.org/release/2.5.2/lib/string-methods.html

The second method is to use a regular expression. Here is the code:
>>> import re
>>> s = "a  b   c d    e f"
>>> p = re.compile(r'\s+')
>>> data = p.sub(' ', s)
>>> data
'a b c d e f'
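
The two methods are not quite identical at the ends of the string. Here is a minimal sketch of the difference; the function names collapse_split and collapse_regex and the sample string are only illustrative, not part of the snippets above:

```python
import re

def collapse_split(text):
    # split() with no arguments drops leading and trailing whitespace,
    # so the joined result never starts or ends with a space.
    return " ".join(text.split())

def collapse_regex(text):
    # \s+ matches every run of whitespace, including runs at the
    # start or end, and each run becomes exactly one space.
    return re.sub(r"\s+", " ", text)

sample = "  a \t b\n\nc  "
print(repr(collapse_split(sample)))  # 'a b c'
print(repr(collapse_regex(sample)))  # ' a b c '
```

If you want the regular expression version to behave like the split/join version, calling strip() on its result should do it.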

Comments

JustGlowing said…
I prefer to use the replace() method to replace a doubled space with a single space:
>>> s = "a  b  c  d  e"
>>> s.replace("  ", " ")
'a b c d e'
Tamim Shahriar said…
But if there is an odd number of spaces between a and b (say a[space][space][space]b), then using replace("[space][space]", "[space]") will leave two spaces between a and b (a[space][space]b).
Sagar said…
You can run multiple iterations of the replace() method.
Let's say s = "a[space][space][space]b"
After iteration 1,
s = "a[space][space]b"
After iteration 2,
s = "a[space]b"
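
The repeated replace() idea from this thread could be sketched as a loop like the one below. The helper name collapse_spaces is hypothetical and not from the comments; note that this approach only touches space characters, so tabs and newlines are left alone:

```python
def collapse_spaces(text):
    # Keep replacing doubled spaces until none are left.
    # Each pass shortens every run of spaces, so the loop terminates.
    while "  " in text:
        text = text.replace("  ", " ")
    return text

print(repr(collapse_spaces("a   b")))  # 'a b'
```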
