Data Analysis with Lee Hawthorn

Python Text Tips

Topics: Python

Python is great for cleaning text in data pipelines. There are so many helper functions that it can be tricky to remember them all, so I've put together a list of the ones I reach for when manipulating text. Python is a highly mature language in 2021, with multiple ways to achieve the same thing.


"100".isnumeric() = True
"100t".isnumeric() = False
"2abc".isalpha() = False
"abc".isalpha() = True
"Very Good !".endswith("!") = True
"1. Hello".startswith("1") = True
int("100") = 100
" Hello ".lstrip() = "Hello "
" Hello ".rstrip() = " Hello"
" Hello ".strip() = "Hello"
"Learn Python".casefold() = "learn python"
"LEArn python".title() = "Learn Python"
"TestPython".removeprefix("Test") = "Python"
"TestPython".removesuffix("Python") = "Test"
"1,2,3".split(',') = ['1', '2', '3']
"1-100-2-Python-Is-Great".split('-', maxsplit=2) = ['1', '100', '2-Python-Is-Great']
"Py" in "Python" = True
"Python 100".find("100") = 7
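
Several of the methods above can be chained into one small cleaning step. This is only a sketch: the helper name `clean_value` and the sample rows are made up for illustration. It uses `isdecimal()` rather than `isnumeric()` as the guard, because `isnumeric()` also accepts characters such as "½" that `int()` rejects.

```python
def clean_value(raw):
    """Hypothetical helper: trim whitespace, then convert digit-only text to int."""
    text = raw.strip()
    # isdecimal() is True only for plain decimal digits, which int() can parse
    if text.isdecimal():
        return int(text)
    # Anything else stays a string, normalised for case-insensitive comparison
    return text.casefold()


rows = [" 100 ", "Learn Python ", "42t"]  # made-up sample data
cleaned = [clean_value(r) for r in rows]
print(cleaned)  # [100, 'learn python', '42t']
```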

points = 19
total = 22
print('Correct answers: {:.2%}'.format(points/total))
print(f'Correct answers: {points/total:.0%}')
Correct answers: 86.36%
Correct answers: 86%
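
If the number of decimal places needs to vary, the format spec itself can be built inside the f-string. A minimal sketch, assuming a hypothetical helper called `format_share`; the 'n/a' fallback for a zero total is my own choice, not something Python requires.

```python
def format_share(points, total, digits=0):
    """Hypothetical helper: format points/total as a percentage string."""
    if total == 0:
        # Assumption: report 'n/a' instead of raising ZeroDivisionError
        return 'n/a'
    # The inner {digits} is substituted first, giving a spec such as '.2%'
    return f'{points / total:.{digits}%}'


print(format_share(19, 22, digits=2))  # 86.36%
print(format_share(19, 22))            # 86%
```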


nums = [2,4,2,4,1,3,3]
distinct_nums = set(nums)
print(distinct_nums)
{1, 2, 3, 4}
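
A set does not promise any particular order, so if the order of the distinct values matters, one of the two approaches below may be a better fit. This is a sketch, and the variable names are only illustrative.

```python
nums = [2, 4, 2, 4, 1, 3, 3]

# dict keys keep insertion order (Python 3.7+), so this preserves first-seen order
first_seen = list(dict.fromkeys(nums))

# sorted() over a set gives the distinct values in ascending order
ascending = sorted(set(nums))

print(first_seen)  # [2, 4, 1, 3]
print(ascending)   # [1, 2, 3, 4]
```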


import re
data = "First Name: Bob Last Name: Dylan"
reg = re.compile(r'First Name: (.*) Last Name: (.*)')
match = reg.search(data)
print(match.group(1))
print(match.group(2))

Bob
Dylan
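
In a real pipeline some rows will not match the pattern, and `search()` returns `None` in that case. Here is a hedged sketch of the same extraction with named groups and a guard for missing matches. The function name `parse_name` and the group names `first` and `last` are assumptions made for the example.

```python
import re

# Same pattern as above, with named groups instead of numbered ones
NAME_RE = re.compile(r'First Name: (?P<first>.*) Last Name: (?P<last>.*)')


def parse_name(text):
    """Hypothetical helper: return a dict of name parts, or None if no match."""
    match = NAME_RE.search(text)
    if match is None:
        # Calling .group() on None would raise AttributeError, so bail out early
        return None
    return match.groupdict()


print(parse_name("First Name: Bob Last Name: Dylan"))  # {'first': 'Bob', 'last': 'Dylan'}
print(parse_name("no names here"))                     # None
```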

© 2020 Lee Hawthorn
