Pandas query find()
The find() function is a great way to check whether a value occurs within a text string. In pandas it is available on string columns as str.find(), which returns the position of the first match, or -1 when the value is not found. For example, say I want to find all entries where the email is missing "@". It sounds very simple, but the same approach applies to many different situations. All you need to do is replace the "@" with a character of your choice:
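import pandas as pd
# create a sample DataFrame
df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Email': ['alice@example.com', 'bob@example.com', 'charlie@example.com']
})
# find the position of the '@' character in each email address
df['Email Position'] = df['Email'].str.find('@')
# keep only the rows where '@' was found (find() returns -1 when it is not)
df_filtered = df[df['Email Position'] != -1]
# display the updated DataFrame and the filtered DataFrame
print(df)
print(df_filtered)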
In this example, we first use the find() method to find the position of the '@' character in each email address in the 'Email' column. We then create a new column called 'Email Position' to store the output of the find() method.
Next, we use boolean indexing to create a filtered DataFrame called df_filtered that only contains rows where the 'Email Position' column is not equal to -1. This effectively filters out any rows where the '@' character was not found in the email address.
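The goal stated at the start, finding entries whose email is missing "@", is the opposite filter: keep the rows where find() returns -1. Here is a minimal sketch, assuming a hypothetical sample in which one address lacks the character. The names df_sample and df_missing are only illustrative.

```python
import pandas as pd

# hypothetical sample data: Bob's address is missing the '@' character
df_sample = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Email': ['alice@example.com', 'bob.example.com', 'charlie@example.com']
})

# str.find returns the index of the first match, or -1 when there is none
df_sample['Email Position'] = df_sample['Email'].str.find('@')

# keep only the rows where '@' was not found
df_missing = df_sample[df_sample['Email Position'] == -1]

print(df_missing)
```

If the position itself is not needed, df_sample['Email'].str.contains('@', regex=False) gives a True/False column instead, and negating it with ~ should select the same rows.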
Link to Google Colab example.