Updating an Excel file in Python

December 27, 2021

Client received a few Excel files of subscribers that needed to be sorted. They’re needing to pull the last name out and put it in its own column, so they can easily look up the subscribers. The issue with the Excel files is they want to sort the subscribers by last name in alphabetical order; however, the names are only listed under full name. To make it worse the last word in that cell might not be the last name, it could be ‘JR’ or even the town ‘Chicago’. Here’s a fake name example of a bad record:

“J T & Jane Q. Smith Jr. St. Louis”

The logic behind the program (high level):

The program opens the excel file and adds the 2 new columns ‘Last Name’ and ‘First Name’ these will be the 2 columns we’ll be filling in for sorting. In the above example we’ll be getting:

Last Name: Smith; First Name: J T & Jane Q.

They have a ‘Company Name’ column and to get that to be sorted I made a straight copy of that cell over to ‘Last Name’.

To check if the city is in the ‘Full Name’ column:

To find out if the city name is in ‘Full Name’, they have a city column I originally thought I can compare the 2 but found that some people had moved since the town was added to their name. An example of this would be if from the example above would say ‘St. Louis’ in ‘Full Name’ but under town it would be ‘Chicago’. To get around this I’m pulling all the towns into a set; a set only keeps the unique names. We’ll use this to run through the ‘Full Name’ to compare if that town is in the ‘Full Name’.  

Another issue with the towns is some are 2 words like ‘St. Louis’. Since the town names were always the last. I pulled the last two words from each ‘Full Name’ and checked that against the town names if the town is found, then set it to have the last two words removed.

Also needed to do a search if there was a suffix to the name, like SR. or JR. this was a pre-made list then compared and removed if found, had to go back and add more as I found them such as a business in the wrong spot with a LTD at the end.

The main purpose for this is to get the client something that is better setup then what they had. This will save a lot of time from somebody having to do all of this manually. It will still take a person to look through the file to finalize it.

GitHub for heavily commented code = https://github.com/amacher/excelPullLastName/blob/main/PullLastName.py

*To help them quickly get through I created a column in Excel and used a RIGHT() code to pull the last name in cell information.

Related Articles

Website Contact form using AWS

Website Contact form using AWS

For the next iteration of the HTML form was to set it up to send an email to the web owner using AWS. The purpose behind this is, if you have a HTML site and don’t want to purchase a monthly plan or build and maintain server software. Using AWS for this functionality...

read more
HTML with JavaScript contact form

HTML with JavaScript contact form

Link to the form: https://www.lynnamacher.com/pdf/contactForm_V1.html This is created to be basic test for an AWS contact form test. The full form when filled out will send an email through AWS to the site owner. Instead of creating the normal test form I try to...

read more
Sometimes the low-tech approach is the best way

Sometimes the low-tech approach is the best way

Have a family member looking to get me some family videos they had digitized, in one big file, about 80GB (they may not have optimized or know how). The need: The family member’s internet connection isn’t the best and could get disconnected at times, when this happens...

read more

Pin It on Pinterest

Share This