5 Common Excel Tasks Simplified with Python
用Python简化的5个常用Excel任务

唐羽曦    华北电力大学
时间:2025-06-20 语向:英-中 类型:人工智能 字数:1050
  • 5 Common Excel Tasks Simplified with Python
    用Python简化的5个常用Excel任务
  • 5 Common Excel Tasks Simplified with Python
    用Python简化的5个常用Excel任务
  • Using Pandas, OS, and the datetime module to simplify Excel tasks with Python.
    使用Pandas,OS和datetime模块用Python简化Excel任务。
  • One of my first jobs involved doing simple Excel tasks that at scale got tedious and time-consuming. I could do all that stuff manually and become an expert in doing mundane tasks, but I decided to learn Python to simplify them instead.
    我最初的工作之一是做一些简单的Excel任务,这些任务在规模上变得单调乏味,费时费力。我可以手动完成所有这些工作,并成为一个做普通任务的专家,但我决定学习Python来简化它们。
  • Excel is a good tool for working with data but has some limitations when it comes to working with hundreds of files that need to be modified. Here’s where Python can help.
    Excel是处理数据的好工具,但在处理数百个需要修改的文件时有一些限制。下面是Python可以提供帮助的地方。
  • In this article, we will simplify 5 Excel tasks that I (and probably you) often had to do at work.
    在本文中,我们将简化我(可能还有你)在工作中经常不得不做的5项Excel任务。
  • First things first
    先做重要的事
  • Install The Libraries
    安装库
  • To simplify these Excel tasks with Python I will use Pandas, OS, and datetime. The last two libraries come by default with Python, while Pandas need to be installed. To do so, write the following command in your terminal.
    为了用Python简化这些Excel任务,我将使用Pandas,OS和DateTime。最后两个库默认随Python一起提供,而Pandas则需要安装。为此,请在终端中编写以下命令。
  • In this article, I will assume you know how to import and export an Excel file using Pandas. But in case you don’t know how to it, here’s how:
    在本文中,我将假设您知道如何使用Pandas导入和导出Excel文件。但如果你不知道怎么做,下面是怎么做的:
  • The Data
    这些数据
  • For this guide, I’m using 2 Excel files that are located in the same directory my Python script is. The spreadsheets contain dummy data that looks like this.
    在本指南中,我使用了两个Excel文件,它们位于Python脚本所在的同一个目录中。电子表格包含如下所示的虚拟数据。
  • The data isn’t so relevant. The main idea is that all the data is divided into multiple chunks (Book1, Book2, Book3, …) and our task is to perform operations to all of them.
    这些数据没有那么重要。主要思想是将所有数据分成多个块(Book1,Book2,Book3,…),我们的任务是对所有数据执行操作。
  • You can create your own dummy data, use your own Excel files or download these two files available on my Github. Put those files in the same directory your Python script is located.
    你可以创建你自己的虚拟数据,使用你自己的Excel文件或者下载我的GitHub上可用的这两个文件。将这些文件放在Python脚本所在的同一个目录中。
  • 1) Concatenate Multiple Excel Files with Python
    1)用Python连接多个Excel文件
  • Our first task is to combine all these chunks in one Excel file (say we want to share all the data with our boss).
    我们的第一个任务是将所有这些块组合在一个Excel文件中(比如我们想与老板共享所有数据)。
  • If there are 2 Excel files, doing this will be as easy as copying the data in one file and pasting it into the other file; however, doing this with hundreds of files becomes impractical.
    如果有2个Excel文件,这样做就像复制一个文件中的数据并粘贴到另一个文件中一样容易;然而,对数百个文件这样做变得不切实际。
  • A simple solution to this will be using Pandas pd.concat().
    一个简单的解决方案是使用Pandas pd.concat()。
  • First, we list all the files in the current directory with os.listdir(). Then we filter files out with the .endswith(‘.xlsx’) method.
    首先,我们用os.listdir()列出当前目录中的所有文件。然后我们使用。endswith('.xlsx')方法筛选出文件。
  • All the Excel files will be stored into a list called my_list and then combined with pd.concat() that return a dataframe we name df_concat. This dataframe is exported to an Excel file.
    所有Excel文件都将存储在一个名为my_list的列表中,然后与pd.concat()组合,后者返回一个我们命名为df_concat的数据框架。此数据框架将导出到Excel文件。
  • Note that you can use read_csv() and to_csv(), in case you’re working with comma-separated values (CSV files)
    注意,您可以使用read_csv()和to_csv(),以防使用逗号分隔的值(CSV文件)
  • 2) Change Name or Extension of Multiple CSV Files with Python
    2)使用Python更改多个CSV文件的名称或扩展名
  • CSV is a common file extension that is used for storing and reading data. It can be opened with Excel and is widely used at work.
    CSV是用于存储和读取数据的通用文件扩展名。它可以用Excel打开,在工作中应用广泛。
  • I often had the task to rename CSV files so anyone could easily identify them (for example, adding the date at the beginning of the name) and also had to convert .csv files to .txt files and vice-versa,
    我经常需要重命名CSV文件,以便任何人都能容易地识别它们(例如,在名称的开头添加日期),并且还必须将。CSV文件转换为。txt文件,反之亦然,
  • Both are simple tasks, but they become a headache when working with hundreds of files. Fortunately, I managed to do these tasks in a couple of seconds using the .split() and os.rename() methods as shown in the code below.
    这两者都是简单的任务,但在处理数百个文件时却变得令人头疼。幸运的是,我使用。split()和os.rename()方法在几秒钟内完成了这些任务,如下面的代码所示。
  • First, we have to use the datetime module to obtain the current date and format it as we want (in this case year, month, day).
    首先,我们必须使用datetime模块获取当前日期,并按我们所需的格式设置它的格式(在本例中为年,月,日)。
  • Then we filter out files other than CSV and use the .split('.') to split the file name every time it reaches a dot sign (.), so we separate the extension from the name of the file. We concatenate names with the f-string (f’’) and rename the file with os.rename(). This takes the current file as a first argument and the new name as second argument
    然后我们筛选出CSV以外的文件,并使用。split(“。”) 为了在文件名每次到达点号(。)时拆分它,所以我们将扩展名与文件名分开。我们将名称与F字符串(F“)连接起来,并用os.rename()重命名文件。这将当前文件作为第一个参数,新名称作为第二个参数
  • 3) Using Excel’s String Functions LEFT-RIGHT in Python
    3)在Python中从左向右使用Excel的字符串函数
  • Imagine you have to add the country code to each phone number of the “Number” column in the spreadsheet we’ve seen before.
    想象一下,您必须将国家代码添加到我们之前看到的电子表格中“数字”列的每个电话号码中。
  • We could execute Excel’s LEFT function to each file, but that could take a lot of time. Instead, we can use a for loop, read each Excel file into a Pandas dataframe and then use lambda and f-string to add the country code to each phone number.
    我们可以对每个文件执行Excel的LEFT函数,但这可能会花费很多时间。相反,我们可以使用for循环,将每个Excel文件读入Pandas数据框架,然后使用lambda和f-string将国家代码添加到每个电话号码。
  • If we use theapply andlambda functions together, we can modify the value inside a column. In this case, I added the “44” code at the beginning of each value within the column “Number.” Note that we have to make sure the values in the column “Number” are string (str), so we used the .astype() method to change data types.
    如果我们一起使用apply和lambda函数,我们可以修改列中的值。在本例中,我在列“number”中的每个值的开头添加了“44”代码。注意,我们必须确保列“number”中的值是string(str),因此我们使用。astype()方法来更改数据类型。
  • Finally, we exported the dataframe to a new Excel file that contains the word “country_code” in the name.
    最后,我们将数据框架导出到一个新的Excel文件中,该文件的名称中包含单词“country_code”。
  • 4) Remove Duplicates Within a Column and Drop NaN Values
    4)删除列中的重复项并删除NaN值
  • Some basic data cleaning performed in Excel like dropping duplicates and NaN values can be easily done with Pandas.
    在Excel中执行的一些基本数据清理,如删除重复项和NaN值,可以很容易地用Pandas完成。
  • You can easily get rid of duplicate rows based on all columns using .drop_duplicates().
    您可以使用.drop_duplicates()轻松地删除基于所有列的重复行。
  • But in case you want to remove duplicates on a specific column add the subset argument. Let’s say we want to drop duplicated within the “Number” column.
    但是,如果要删除特定列上的重复项,请添加子集参数。假设我们希望删除“Number”列中的重复项。
  • On the other hand, dropping NaN values can be achieved with .dropna()
    另一方面,可以使用。dropNA()实现删除NaN值
  • You can also replace null values instead of dropping the whole row. Let’s say we want to replace null values with zero.
    也可以替换空值,而不是删除整行。假设我们想用零替换空值。
  • In addition to that, it’s possible to fill those null values with values above (forward filling) or below (backward filling) a row.
    除此之外,还可以使用行以上(向前填充)或行以下(向后填充)的值填充这些空值。
  • 5) Sort Values
    5)排序值
  • Sorting a Pandas dataframe is as easy as sorting an Excel spreadsheet. We only have to use the .sort_values() and specify either ascending or descending (ascending=True, ascending=False) and also the column(s) to sort by.
    对Pandas数据框架进行排序就像对Excel电子表格进行排序一样简单。我们只需使用.sort_values()并指定升序或降序(ascending=true,ascending=false)以及排序依据的列。
  • In case we want to save the changes without having to set the value equal to df as shown above, we can use the inplace argument
    如果我们想要保存更改而不必将值设置为等于df(如上所示),我们可以使用inplace参数
  • That’s it! Those are 5 common tasks we might get at our job and can be simplified in Python. If you’re an Excel user and want to move to Python/Pandas, check the guide below.
    就是这样!这是我们在工作中可能遇到的5个常见任务,可以在Python中进行简化。如果您是一个Excel用户,想要转到Python/Pandas,请查看下面的指南。
  • Join my email list with 3k+ people to get my Python for Data Science Cheat Sheet I use in all my tutorials (Free PDF)
    加入我的电子邮件列表与3K+的人,以获得我的Python为数据科学备忘表,我在我的所有教程(免费PDF)
  • If you enjoy reading stories like these and want to support me as a writer, consider signing up to become a Medium member. It’s $5 a month, giving you unlimited access to stories on Medium. If you sign up using my link, I’ll earn a small commission with no extra cost to you.
    如果你喜欢阅读像这样的故事,并且想支持我成为一个作家,可以考虑注册成为一个Medium会员。一个月5美元,你可以无限制地访问媒体上的故事。如果你注册使用我的链接,我将赚取一个小的佣金,没有额外的成本给你。

400所高校都在用的翻译教学平台

试译宝所属母公司