Updating an Excel file in Python

December 27, 2021

A client received a few Excel files of subscribers that needed to be sorted. They want to sort the subscribers alphabetically by last name so they can look them up easily, but the files only list each subscriber under a single ‘Full Name’ column, so the last name has to be pulled out into its own column. To make it worse, the last word in that cell might not be the last name; it could be a suffix like ‘JR’ or even a town like ‘Chicago’. Here’s a made-up example of a bad record:

“J T & Jane Q. Smith Jr. St. Louis”

The logic behind the program (high level):

The program opens the Excel file and adds two new columns, ‘Last Name’ and ‘First Name’. These are the two columns we’ll fill in for sorting. For the example above we’ll get:

Last Name: Smith; First Name: J T & Jane Q.

The files also have a ‘Company Name’ column. To get those records sorted as well, I made a straight copy of that cell into ‘Last Name’.
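A minimal sketch of this step, assuming each spreadsheet row has already been read into a Python dict keyed by column header. The function name `add_name_columns` and the sample rows are made up for illustration; the real program on GitHub works on the Excel file directly and may be structured differently.

```python
def add_name_columns(rows):
    """Add 'Last Name' and 'First Name' to every row.

    If the row has a company name, it is copied straight into
    'Last Name' so company records sort alongside people.
    """
    for row in rows:
        company = (row.get("Company Name") or "").strip()
        row["Last Name"] = company   # empty string when there is no company
        row["First Name"] = ""       # filled in later from 'Full Name'
    return rows


# Made-up sample rows (not real subscriber data)
rows = [
    {"Full Name": "J T & Jane Q. Smith Jr. St. Louis", "Company Name": ""},
    {"Full Name": "", "Company Name": "Acme Supply LTD"},
]
add_name_columns(rows)
```

After this runs, the company row already has a sortable ‘Last Name’, and the person row is left blank for the name-splitting logic.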

To check if the city is in the ‘Full Name’ column:

The files have a city column, and I originally thought I could compare the two, but I found that some people had moved since the town was added to their name. In the example above, ‘Full Name’ would say ‘St. Louis’ while the city column says ‘Chicago’. To get around this I pull all the towns into a set; a set only keeps the unique names. Each ‘Full Name’ is then checked against that set to see whether it contains a town.
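Here is a rough sketch of the town set, again assuming the rows are plain dicts. The column key ‘City’, the helper name `collect_towns`, and the lower-casing are my assumptions for the example, not necessarily what the original script does.

```python
def collect_towns(rows, city_key="City"):
    """Return the unique town names from the city column.

    Names are stripped and case-folded so 'Chicago' and 'chicago '
    count as the same town (an assumption made for this sketch).
    """
    towns = set()
    for row in rows:
        city = (row.get(city_key) or "").strip()
        if city:
            towns.add(city.casefold())
    return towns


# Made-up sample rows: note the city column does not match the town
# embedded in 'Full Name', which is why a per-row comparison fails.
rows = [
    {"Full Name": "J T & Jane Q. Smith Jr. St. Louis", "City": "Chicago"},
    {"Full Name": "Pat Doe Chicago", "City": "St. Louis"},
    {"Full Name": "Sam Roe", "City": "chicago "},
]
towns = collect_towns(rows)
```

Because the set is built from every row, ‘St. Louis’ is still recognized in the first record even though that subscriber’s own city is now ‘Chicago’.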

Another issue with the towns is that some are two words, like ‘St. Louis’. Since the town names were always at the end, I pulled the last two words from each ‘Full Name’ and checked them against the town names. If the town is found, those last two words are removed.
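The trailing-town check could look something like the sketch below. The function name `strip_trailing_town` is hypothetical, and the extra fallback to a one-word town (for cases like ‘Chicago’) is my assumption about how single-word towns would be handled.

```python
def strip_trailing_town(full_name, towns):
    """Remove a town from the end of full_name if one is found.

    `towns` is a set of case-folded town names. The last two words
    are tried first (for towns like 'St. Louis'), then the last word.
    """
    words = full_name.split()
    for size in (2, 1):
        # Keep at least one word so a name is never emptied out
        if len(words) > size:
            candidate = " ".join(words[-size:]).casefold()
            if candidate in towns:
                return " ".join(words[:-size])
    return " ".join(words)


towns = {"chicago", "st. louis"}  # made-up town set for the example
cleaned = strip_trailing_town("J T & Jane Q. Smith Jr. St. Louis", towns)
```

Checking two words before one matters: if the one-word test ran first, a town whose second word is itself a town name would be cut in the wrong place.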

I also needed to search for a suffix on the name, like SR. or JR. These are kept in a pre-made list that each name is compared against, and the suffix is removed if found. I had to go back and add more entries as I found them, such as a business in the wrong spot with LTD at the end.
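As an illustration of the suffix step, here is a small sketch that strips known suffixes and then splits what is left into last and first name. The `SUFFIXES` entries come from the examples in this post; the function name `split_name` and the punctuation handling are assumptions for the sketch, and the list in the real script is longer.

```python
# Suffixes to drop from the end of a name (extend as new ones turn up)
SUFFIXES = {"jr", "sr", "ltd"}


def split_name(name):
    """Return (last_name, first_name) after removing trailing suffixes."""
    words = name.split()
    # Drop suffixes such as 'Jr.' or 'LTD', ignoring case and trailing dots
    while len(words) > 1 and words[-1].rstrip(".,").casefold() in SUFFIXES:
        words.pop()
    if not words:
        return "", ""
    last_name = words[-1].rstrip(",")
    first_name = " ".join(words[:-1])
    return last_name, first_name


last, first = split_name("J T & Jane Q. Smith Jr.")
```

For the sample record this gives the ‘Smith’ and ‘J T & Jane Q.’ split shown earlier, once the town has already been removed from the end.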

The main purpose of this is to give the client something that is better set up than what they had. It will save a lot of time compared with somebody doing all of this manually. It will still take a person to look through the file to finalize it.

Heavily commented code on GitHub: https://github.com/amacher/excelPullLastName/blob/main/PullLastName.py

*To help them get through the file quickly, I created a column in Excel and used a RIGHT() formula to pull the last name from the cell.
