Python Regular Expression: Exercise-47 with Solution. Note: A delimiter is a sequence of one or more characters used to specify the boundary between separate, independent regions in plain text or other data streams. An example of a delimiter is the comma character, which acts as a field delimiter in a sequence of comma-separated values.
Here's the simplest way to explain this. Here's what I'm using:
Here's what I want:
The reason is that I want to split a string into tokens, manipulate it, then put it back together again.
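A minimal sketch of that round trip, assuming the standard library's re.split with a capturing group in the pattern (which keeps each matched separator as its own list element). The sample string and the upper-casing step are only illustrative:

```python
import re

text = "foo/bar spam\neggs"

# The parentheses form a capturing group, so re.split returns the
# separators alongside the pieces between them.
tokens = re.split(r"(\W)", text)

# With a single capturing group, separators sit at the odd indices.
# Manipulate only the words (even indices) and leave separators alone.
shouted = [tok.upper() if i % 2 == 0 else tok for i, tok in enumerate(tokens)]

# Joining the tokens puts the string back together.
rebuilt = "".join(shouted)
```

Because nothing is discarded, joining the unmodified tokens reproduces the original string exactly.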
Ken Kinder
10 Answers
Commodore Jaeger
If you are splitting on newlines, use splitlines(True). (Not a general solution, but adding this here in case someone comes here not realizing this method exists.)
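For illustration, a short sketch of str.splitlines with keepends set to True; the sample text is made up:

```python
text = "first line\nsecond line\r\nthird line"

# Passing True (the keepends argument) leaves each line ending
# attached to the line it terminates.
lines = text.splitlines(True)

# Nothing was dropped, so joining restores the original text.
restored = "".join(lines)
```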
Mark Lodato
Another no-regex solution that works well on Python 3
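One possible regex-free approach, sketched with str.find. The helper name split_keep is hypothetical, and this is not necessarily the method the answer originally showed:

```python
def split_keep(text, sep):
    """Split text on sep, leaving sep attached to the end of each piece."""
    if not sep:
        raise ValueError("empty separator")
    pieces = []
    start = 0
    while True:
        idx = text.find(sep, start)
        if idx == -1:
            break
        end = idx + len(sep)
        pieces.append(text[start:end])  # piece plus its trailing separator
        start = end
    # Whatever follows the last separator (possibly an empty string).
    pieces.append(text[start:])
    return pieces


parts = split_keep("a,b,c", ",")
```

Like str.split, this returns a trailing empty string when the text ends with the separator.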
ootwch
If you have only 1 separator, you can employ list comprehensions:
Appending/prepending separator:
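A minimal sketch, assuming a comma-separated sample string; the variable names are illustrative:

```python
text = "foo,bar,baz,qux"
sep = ","
parts = text.split(sep)

# Append the separator to every piece except the last one.
appended = [p + sep for p in parts[:-1]] + parts[-1:]

# Prepend the separator to every piece except the first one.
prepended = parts[:1] + [sep + p for p in parts[1:]]
```

Either list joins back into the original string with "".join(...).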
Separator as its own element:
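A sketch of one way to do this with a nested comprehension, again on an illustrative sample string:

```python
text = "foo,bar,baz,qux"
sep = ","

# Emit each piece followed by the separator, then drop the final
# separator, which has no piece after it.
interleaved = [item for piece in text.split(sep) for item in (piece, sep)][:-1]
```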
Granitosaurus
Another example: split on non-alphanumeric characters and keep the separators.
output:
explanation
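A sketch of what such a split can look like with re.split, with the parts of the pattern explained in comments. The sample string is an assumption made for illustration:

```python
import re

text = "foo,bar@candy*ice%cream"

# ( ... )        capturing group: makes re.split keep the separators
# [ ... ]        character class: matches a single character
# ^a-zA-Z0-9     negation: any character that is NOT a letter or digit
parts = re.split(r"([^a-zA-Z0-9])", text)
```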
anurag
You can also split a string with a list of separator strings instead of a regular expression, like this:
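A hedged sketch of the idea; the function name tokenize and the sample separators are hypothetical. At each position it tries the longest separators first, so that "==" is preferred over "=":

```python
def tokenize(text, separators):
    """Split text on any of the separator strings, keeping the separators."""
    # Longest first, so a multi-character separator wins over its prefix.
    ordered = sorted((s for s in separators if s), key=len, reverse=True)
    tokens = []
    buffer = ""
    i = 0
    while i < len(text):
        # First separator (if any) that starts at position i.
        match = next((s for s in ordered if text.startswith(s, i)), None)
        if match is None:
            buffer += text[i]
            i += 1
        else:
            if buffer:
                tokens.append(buffer)
                buffer = ""
            tokens.append(match)
            i += len(match)
    if buffer:
        tokens.append(buffer)
    return tokens


result = tokenize("a==b=c", ["=", "=="])
```

Unlike re.split, this sketch does not emit empty strings between adjacent separators.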
Anderson Green
Moisey Oysgelt
To split a string by regex while keeping the separators, without a capturing group:
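A sketch of this approach built on finditer from the re module; the helper name and the sample pattern are assumptions:

```python
import re


def split_keep_separators(pattern, text):
    """Return the non-empty pieces of text and the separators between them."""
    pieces = []
    last_end = 0
    for match in pattern.finditer(text):
        # Text between the previous separator and this one, if any.
        if match.start() > last_end:
            pieces.append(text[last_end:match.start()])
        # The separator itself (skip zero-width matches).
        if match.group():
            pieces.append(match.group())
        last_end = match.end()
    # Text after the final separator, if any.
    if last_end < len(text):
        pieces.append(text[last_end:])
    return pieces


pattern = re.compile(r"[()]")
result = split_keep_separators(pattern, "f(x)(y) done")
```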
If the regex is already wrapped in a capturing group:
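A sketch of this variant, assuming the whole pattern is a single capturing group so that the compiled pattern's split method returns the separators too. The helper name and sample pattern are hypothetical:

```python
import re


def split_with_separators(pattern, text):
    """Split on a pattern wrapped in one capturing group, dropping empties."""
    return [piece for piece in pattern.split(text) if piece]


pattern = re.compile(r"([()])")
raw = pattern.split("f(x)(y) done")
result = split_with_separators(pattern, "f(x)(y) done")
```

The raw split contains an empty string where two separators are adjacent; the filter removes it.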
Both approaches also drop empty strings, which are useless and annoying in most cases.
Dmitriy Sintsov
One Lazy and Simple Solution
Assume your regex pattern is
split_pattern = r'(!|\?)'
First, append a marker string that does not occur in the text, such as '[cut]', after every separator:
new_string = re.sub(split_pattern, r'\1[cut]', your_string)
Then split on the marker:
new_string.split('[cut]')
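Put together as a runnable sketch, with the question mark escaped in the pattern and a \1 backreference in the replacement. The sample sentence is made up, and the marker is assumed not to occur in the input:

```python
import re

split_pattern = r"(!|\?)"   # the ? must be escaped to match it literally
marker = "[cut]"            # assumed not to appear in the input text
your_string = "Hello! How are you? Fine."

# \1 re-inserts the matched separator, followed by the marker.
new_string = re.sub(split_pattern, r"\1" + marker, your_string)

# Splitting on the marker leaves each separator attached to its piece.
pieces = new_string.split(marker)
```

If the input ends with a separator, the result has a trailing empty string, as with any str.split.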
Yilei Wang
I had a similar issue trying to split a file path and struggled to find a simple answer. This worked for me and didn't involve having to substitute delimiters back into the split text:
my_path = 'folder1/folder2/folder3/file1'
import re
re.findall('[^/]+/|[^/]+', my_path)
returns:
['folder1/', 'folder2/', 'folder3/', 'file1']
Conor