Updating an Excel file in Python

December 27, 2021

A client received a few Excel files of subscribers that needed to be sorted. They want to sort the subscribers alphabetically by last name so they can look them up easily; however, the names are only listed in a single ‘Full Name’ column, so the last name has to be pulled out into its own column. To make it worse, the last word in that cell might not be the last name; it could be ‘JR’ or even a town such as ‘Chicago’. Here’s a fake example of a bad record:

“J T & Jane Q. Smith Jr. St. Louis”

The logic behind the program (high level):

The program opens the Excel file and adds two new columns, ‘Last Name’ and ‘First Name’. These are the two columns we’ll fill in for sorting. For the example above, we’ll get:

Last Name: Smith; First Name: J T & Jane Q.

They also have a ‘Company Name’ column. To get those records sorted as well, I made a straight copy of that cell over to ‘Last Name’.
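As a rough sketch of that final split (not the code from the repository), the function below assumes any trailing town or suffix has already been removed from the name, so the last remaining word is treated as the last name. The function name `split_name` is hypothetical, and leaving ‘First Name’ empty for company records is my assumption for the example.

```python
def split_name(cleaned_name, company_name=""):
    """Return (last_name, first_name) for one subscriber row.

    Assumes cleaned_name no longer ends with a town or suffix.
    """
    company = (company_name or "").strip()
    if company:
        # Company rows: straight copy of the company name into 'Last Name'.
        # Leaving 'First Name' empty is an assumption for this sketch.
        return company, ""

    words = (cleaned_name or "").split()
    if not words:
        return "", ""

    # The last word is the last name; everything before it is the first name.
    return words[-1], " ".join(words[:-1])


person = split_name("J T & Jane Q. Smith")
business = split_name("", "Acme Supply LTD")
print(person)    # ('Smith', 'J T & Jane Q.')
print(business)  # ('Acme Supply LTD', '')
```

Two-word last names would still be split incorrectly here, which is one reason a person has to review the result.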

To check if the city is in the ‘Full Name’ column:

They have a city column, so I originally thought I could compare the two, but I found that some people had moved since the town was added to their name. For example, the record above might say ‘St. Louis’ in ‘Full Name’ but ‘Chicago’ under town. To get around this, I’m pulling all the towns into a set; a set only keeps the unique names. We’ll check each ‘Full Name’ against this set to see whether it contains a town.
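Here is a minimal sketch of building that set. It works on a plain list of city values rather than a real worksheet, and the upper-casing and whitespace cleanup are assumptions I added so that ‘chicago ’ and ‘Chicago’ count as the same town; `build_town_set` is a hypothetical name.

```python
def build_town_set(city_values):
    """Collect the unique town names from the values of a city column."""
    towns = set()
    for value in city_values:
        if value is None:
            continue  # skip empty cells
        # Collapse extra spaces and upper-case so matching ignores case.
        name = " ".join(str(value).split()).upper()
        if name:
            towns.add(name)
    return towns


city_column = ["Chicago", "St. Louis", "chicago ", None, "St. Louis"]
towns = build_town_set(city_column)
print(sorted(towns))  # ['CHICAGO', 'ST. LOUIS']
```

Because the set is built from every row, a name can be matched against a town even when that subscriber’s own city cell now says something else.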

Another issue with the towns is that some are two words, like ‘St. Louis’. Since the town name was always last, I pulled the last two words from each ‘Full Name’ and checked them against the town names. If the town is found, the last two words are removed.
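The check could look something like the sketch below. It tries the last two words first and then falls back to the last single word (for one-word towns like ‘Chicago’); that fallback order, the upper-case town set, and the name `strip_trailing_town` are assumptions for this example and may differ from the code on GitHub.

```python
def strip_trailing_town(full_name, towns):
    """Remove a town from the end of full_name if it is in the towns set.

    towns is assumed to hold upper-case town names.
    """
    words = full_name.split()
    # Try the two-word ending first, then the one-word ending.
    for size in (2, 1):
        if len(words) > size:
            candidate = " ".join(words[-size:]).upper()
            if candidate in towns:
                return " ".join(words[:-size])
    return " ".join(words)


towns = {"CHICAGO", "ST. LOUIS"}
trimmed = strip_trailing_town("J T & Jane Q. Smith Jr. St. Louis", towns)
print(trimmed)  # J T & Jane Q. Smith Jr.
```

A subscriber whose last name happens to match a town would be trimmed by mistake, so the output still needs a quick look by a person.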

I also needed to check whether the name had a suffix, like SR. or JR. These are compared against a pre-made list and removed if found. I had to go back and add more as I found them, such as a business in the wrong spot with an LTD at the end.
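A small sketch of the suffix check follows. The list here only contains the three suffixes mentioned in this post; the real list grew as more cases turned up. Ignoring trailing periods and commas, and the name `strip_suffix`, are assumptions made for the example.

```python
# Only the suffixes named in the post; extend as new ones are found.
SUFFIXES = {"JR", "SR", "LTD"}


def strip_suffix(full_name, suffixes=SUFFIXES):
    """Remove known suffixes (JR, SR, LTD, ...) from the end of a name."""
    words = full_name.split()
    # Keep at least one word so a lone 'Jr.' is not erased entirely.
    while len(words) > 1 and words[-1].rstrip(".,").upper() in suffixes:
        words.pop()
    # Drop a comma left behind by names like 'Smith, Jr.'
    return " ".join(words).rstrip(",")


cleaned = strip_suffix("J T & Jane Q. Smith Jr.")
print(cleaned)  # J T & Jane Q. Smith
```

Running this after the town has been removed leaves ‘Smith’ as the last word of the example record.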

The main purpose of this is to get the client something that is better set up than what they had. It will save a lot of time compared with somebody doing all of this manually. It will still take a person to look through the file to finalize it.

GitHub for heavily commented code: https://github.com/amacher/excelPullLastName/blob/main/PullLastName.py

*To help them get through the review quickly, I created a column in Excel and used a RIGHT() formula to pull the last name from the cell.
