I have a text log file containing JSON data. The file contains many duplicate records, and I need to find the unique records among thousands of rows.
We can achieve this in Microsoft Excel by importing the JSON records from the text file. Below is what my text log file content looks like.
"{"info":{"osver":"IOS","browserver":"Safari 15","brand":"Apple Computer, Inc.","extra":"starltexx","libver":"2.21.8.9"},"type":"webdevinfo"}" , 1 )
"{"info":{"osver":"IOS","browserver":"Safari 15","brand":"Apple Computer, Inc.","extra":"starltexx","libver":"2.21.8.9"},"type":"webdevinfo"}" , 2 )
"{"info":{"osver":"IOS","browserver":"Safari 15","brand":"Apple Computer, Inc.","extra":"starltexx","libver":"2.21.8.9"},"type":"webdevinfo"}" , 4 )
"{"info":{"osver":"IOS","browserver":"Safari 15","brand":"Apple Computer, Inc.","extra":"starltexx","libver":"2.21.8.9"},"type":"webdevinfo"}" , 1 )
"{"info":{"osver":"IOS","browserver":"Safari 15","brand":"Apple Computer, Inc.","extra":"starltexx","libver":"2.21.8.9"},"type":"webdevinfo"}" , 2 )
"{"info":{"osver":"IOS","browserver":"Safari 15","brand":"Apple Computer, Inc.","extra":"starltexx","libver":"2.21.8.9"},"type":"webdevinfo"}" , 4 )
"{"info":{"osver":"IOS","browserver":"Safari 15","brand":"Apple Computer, Inc.","extra":"starltexx","libver":"2.21.8.9"},"type":"webdevinfo"}" , 1 )
"{"info":{"osver":"IOS","browserver":"Safari 15","brand":"Apple Computer, Inc.","extra":"starltexx","libver":"2.21.8.9"},"type":"webdevinfo"}" , 2 )
"{"info":{"osver":"IOS","browserver":"Safari 15","brand":"Apple Computer, Inc.","extra":"starltexx","libver":"2.21.8.9"},"type":"webdevinfo"}" , 4 )
"{"info":{"osver":"win10","browserver":"Chrome 99","brand":"Google Inc.","extra":"starltexx","libver":"2.21.8.9"},"type":"webdevinfo"}" , 1 )
"{"info":{"osver":"win10","browserver":"Chrome 99","brand":"Google Inc.","extra":"starltexx","libver":"2.21.8.9"},"type":"webdevinfo"}" , 2 )
"{"info":{"osver":"win10","browserver":"Chrome 99","brand":"Google Inc.","extra":"starltexx","libver":"2.21.8.9"},"type":"webdevinfo"}" , 4 )
In the above example, I can see 2 unique records among many rows. Picking out the unique records manually would be difficult.
Let us see how we can remove the duplicates and keep only the unique records from the text log file using Microsoft Excel. I have a file with a .txt extension containing the logs. Open Microsoft Excel, go to Open -> Browse, and select your text file.

Remember to analyze your data first; it will make the next steps easier. The steps I am going to show you worked for me; your steps may differ depending on the type and structure of the data in your text file.
You will not be able to see text files while browsing, so remember to change the file type to All Files.

A Text Import Wizard will open with your data and a few other options. Select Delimited, since I will be separating each row into different columns using delimiters. Click Next.

In the next step of the Text Import Wizard, select the delimiters that suit your data. I want to split each row on the comma and the colon (check Comma and type the colon into the Other box). You can see a preview once you select your delimiters. My requirement is to get the unique OS and browser values from the data. Click Next.
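
If you want a quick sanity check of which pieces of a row land in which column, here is a minimal Python sketch of the same split. It is my own illustration, not part of the Excel workflow, and it assumes the row is split on every comma and colon with the quotes treated as ordinary characters (in the wizard, you may need to set Text qualifier to {none} for the columns to line up this way):

    import re

    line = '"{"info":{"osver":"IOS","browserver":"Safari 15","brand":"Apple Computer, Inc.","extra":"starltexx","libver":"2.21.8.9"},"type":"webdevinfo"}" , 1 )'

    # Split on every comma and colon, mirroring the wizard's
    # Comma + Other (:) delimiter settings.
    parts = re.split(r"[,:]", line)

    print(parts[2])  # column C -> "IOS"        (the osver value)
    print(parts[4])  # column E -> "Safari 15"  (the browserver value)

This also shows why columns C and E are the interesting ones: the osver and browserver values fall into the third and fifth columns after the split.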

In the last step of the Text Import Wizard, you can specify the data format for your columns. The default is General, which converts numeric values to numbers, date values to dates, and all remaining values to text. Click Finish.

I am only interested in the unique values in columns C and E. Therefore, I will select the range from column C to column E, with all rows included.

After selecting the range, go to the Data tab and click Remove Duplicates under Data Tools. I am working on the latest Microsoft 365 Excel as of this writing.

Uncheck the columns you do not need and click OK. You can leave them checked as well; it will not make a difference if all the values in column D are the same. Still, it is good practice to keep only the required columns checked.

A new dialog box will appear, telling you how many duplicate rows were removed and how many unique rows remain.
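
If your log is too large to open comfortably in Excel, the same result can be scripted. Here is a minimal Python sketch under the same assumptions as the split example above (split on commas and colons, OS value in column C, browser value in column E); the file name logs.txt is a placeholder for your own log file:

    import re

    seen = set()
    unique_pairs = []  # first-seen order, like Remove Duplicates keeping the first occurrence

    with open("logs.txt", encoding="utf-8") as f:  # placeholder file name
        for line in f:
            parts = re.split(r"[,:]", line.strip())
            if len(parts) < 5:
                continue  # skip lines that don't match the expected layout
            pair = (parts[2], parts[4])  # osver and browserver values
            if pair not in seen:
                seen.add(pair)
                unique_pairs.append(pair)

    for osver, browser in unique_pairs:
        print(osver, browser)

For the sample data above, this prints two pairs: one for the iOS/Safari records and one for the win10/Chrome records.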
