Manipulating Excel Spreadsheets using Python

Microsoft Excel is spreadsheet software used to store and manage tabular data. With Excel, calculations can be carried out by applying formulas to the data, and data visualizations can be produced. Many tasks performed in spreadsheets, such as mathematical operations, can be automated via programming, and many programming languages have modules for manipulating Excel spreadsheets. In this tutorial, we will show you how to use Python's openpyxl module to read and modify Excel spreadsheets.

Installing openpyxl

openpyxl is installed with pip, the Python package installer, so pip must be available first. Recent Python releases usually include pip. Run the following command in the command prompt to see if pip is installed.

C:\Users\windows> pip help

If the help content of pip is returned, then pip is installed; otherwise, go to the following link and download the get-pip.py file:

https://bootstrap.pypa.io/get-pip.py

Now, run the following command to install pip:

C:\Users\windows> python get-pip.py

After installing pip, the following command can be used to install openpyxl.

C:\Users\windows> pip install openpyxl
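To confirm that the installation worked, a short check like the following can be run in Python. This is a minimal sketch: it only assumes that the installed package exposes its version string as openpyxl.__version__, and the printed value depends on your environment.

```python
# Quick check that openpyxl can be imported after installation.
import openpyxl

# __version__ is a plain string; the exact value depends on your installation.
installed_version = openpyxl.__version__
print("openpyxl version:", installed_version)
```

If the import fails with ModuleNotFoundError, pip most likely installed the package for a different Python interpreter than the one you are running.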

Creating an Excel Document

In this section, we will use the openpyxl module to create an Excel document. First, open the command prompt by typing 'cmd' in the search bar; then, enter the following command to start the Python interpreter:

C:\Users\windows> python

To create an Excel workbook, we will import the openpyxl module and then call the 'Workbook()' class to create a workbook object.

>>> # importing openpyxl module
>>> import openpyxl
>>> # Initializing a Workbook
>>> work_book = openpyxl.Workbook()
>>> # saving workbook as 'example.xlsx'
>>> work_book.save('example.xlsx')

The above commands create an Excel document called example.xlsx. Next, we will manipulate this Excel document.
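The same steps can also be run as a script instead of in the interactive interpreter. The following is a minimal sketch; the temporary output directory is an assumption made for illustration, so that no file in your working folder is overwritten.

```python
import os
import tempfile

import openpyxl

# Hypothetical output location: a temporary directory, so nothing in the
# current folder is overwritten.
output_dir = tempfile.mkdtemp()
output_path = os.path.join(output_dir, "example.xlsx")

work_book = openpyxl.Workbook()        # a new workbook starts with one sheet
default_names = work_book.sheetnames   # names of the sheets in the new workbook
work_book.save(output_path)

# Reload the file to confirm that it was written and can be read back.
reloaded = openpyxl.load_workbook(output_path)
print(default_names, reloaded.sheetnames)
```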

Manipulating Sheets in an Excel Document

We have created an Excel document called example.xlsx. Now, we will manipulate the sheets of this document using Python. The workbook object has a 'create_sheet()' method that can be used to create a new sheet. This method takes two optional arguments: title and index. Title is the name of the sheet, and index is the position at which the sheet is inserted, given as a non-negative integer (including 0); if index is omitted, the sheet is added at the end. A list of the names of all the sheets in the work_book object is available through the sheetnames attribute.

>>> # importing openpyxl
>>> import openpyxl
>>> # loading existing Excel Document into work_book Object
>>> work_book = openpyxl.load_workbook('example.xlsx')
>>> # Creating a new Sheet at 0th index
>>> work_book.create_sheet(index=0, title='First Sheet')
<Worksheet "First Sheet">
>>> # Getting all the Sheets
>>> work_book.sheetnames
['First Sheet', 'Sheet']
>>> # Saving Excel Document
>>> work_book.save('example.xlsx')

In the above code, we created a sheet named First Sheet and placed it at index 0. The sheet previously located at index 0 moved to index 1, as shown in the output. Next, we will change the name of the original sheet from Sheet to Second Sheet.

The title attribute holds the name of the sheet. To rename a sheet, we must first navigate to that sheet as follows.

>>> # Getting active sheet from Excel Document
>>> sheet = work_book.active
>>> # Printing Sheet Name
>>> print(sheet.title)
First Sheet
>>> # Navigating to Second Sheet (at index 1)
>>> work_book.active = 1
>>> # Getting Active Sheet
>>> sheet = work_book.active
>>> # printing Sheet Name
>>> print(sheet.title)
Sheet
>>> # Changing Sheet Title
>>> sheet.title = 'Second Sheet'
>>> # Printing Sheet title
>>> print(sheet.title)
Second Sheet
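The sheet operations above (creating a sheet at a given index and renaming a sheet) can also be sketched as a standalone script. This example works on a fresh in-memory workbook rather than example.xlsx, and the sheet name 'Last Sheet' is made up for illustration. It selects the sheet to rename by its title instead of changing the active sheet.

```python
import openpyxl

work_book = openpyxl.Workbook()                       # starts with 'Sheet'
work_book.create_sheet(title="First Sheet", index=0)  # inserted at the front
work_book.create_sheet(title="Last Sheet")            # no index: added at the end

order_before = list(work_book.sheetnames)

# Select the sheet by its title and rename it through the title attribute.
work_book["Sheet"].title = "Second Sheet"
order_after = list(work_book.sheetnames)

print(order_before)
print(order_after)
```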

Similarly, we can remove a sheet from the Excel document. The workbook object offers the remove() method to remove a sheet. This method takes the worksheet object to be removed as an argument, which we can look up by name using work_book['Sheet Name']. We can remove Second Sheet as follows:

>>> # removing a Sheet by name
>>> work_book.remove(work_book['Second Sheet'])
>>> # getting all the sheets
>>> work_book.sheetnames
['First Sheet']
>>> # saving Excel Document
>>> work_book.save('example.xlsx')
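Looking up a sheet that does not exist raises an error, so it can be useful to check sheetnames before removing. The following is a minimal sketch; remove_sheet_if_present is a hypothetical helper written for this example, not part of openpyxl.

```python
import openpyxl


def remove_sheet_if_present(work_book, title):
    """Hypothetical helper: remove the sheet named `title`.

    Returns True if a sheet was removed and False if no such sheet exists.
    """
    if title not in work_book.sheetnames:
        return False
    work_book.remove(work_book[title])
    return True


work_book = openpyxl.Workbook()
work_book.create_sheet(title="Second Sheet")

removed = remove_sheet_if_present(work_book, "Second Sheet")
missing = remove_sheet_if_present(work_book, "No Such Sheet")
print(removed, missing, work_book.sheetnames)
```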

Adding Data to Cells

So far, we have shown you how to create or delete sheets in an Excel document. Now, we are going to add data to the cells of different sheets. In this example, we have a single sheet named First Sheet in our document, and we want to create two more sheets.

>>> # importing openpyxl
>>> import openpyxl
>>> # loading workbook
>>> work_book = openpyxl.load_workbook('example.xlsx')
>>> # Creating a new Sheet at 1st index
>>> work_book.create_sheet(index=1, title='Second Sheet')
<Worksheet "Second Sheet">
>>> # creating a new Sheet at 2nd index
>>> work_book.create_sheet(index=2, title='Third Sheet')
<Worksheet "Third Sheet">
>>> # getting all the sheets
>>> work_book.sheetnames
['First Sheet', 'Second Sheet', 'Third Sheet']

Now, we have three sheets, and we will add data to the cells of these sheets.

>>> # Getting First Sheet
>>> sheet_1 = work_book['First Sheet']
>>> # Adding Data to 'A1' Cell of First Sheet
>>> sheet_1['A1'] = 'Name'
>>> # Getting Second Sheet
>>> sheet_2 = work_book['Second Sheet']
>>> # Adding Data to 'A1' Cell of Second Sheet
>>> sheet_2['A1'] = 'ID'
>>> # Getting Third Sheet
>>> sheet_3 = work_book['Third Sheet']
>>> # Adding Data to 'A1' Cell of Third Sheet
>>> sheet_3['A1'] = 'Grades'
>>> # Saving Excel Workbook
>>> work_book.save('example.xlsx')
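Besides assigning to a cell by its name, openpyxl also provides the append() and cell() methods of a worksheet for writing data. The following minimal sketch uses a fresh in-memory workbook and made-up sample values.

```python
import openpyxl

work_book = openpyxl.Workbook()
sheet = work_book.active

# append() writes a list of values into the next empty row.
sheet.append(["Name", "ID", "Grade"])   # row 1
sheet.append(["Alice", 1, "A"])         # row 2 (made-up sample data)

# cell() addresses a cell by 1-based row and column numbers.
sheet.cell(row=3, column=1, value="Bob")
sheet.cell(row=3, column=2, value=2)
sheet.cell(row=3, column=3, value="B")

print(sheet["A2"].value, sheet["B3"].value, sheet.max_row)
```

append() is convenient when filling a table row by row, whereas cell() is easier to use inside loops that compute row and column numbers.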

Reading Excel Sheets

The openpyxl module uses the value attribute of a cell to store the data of that cell. We can read the data in a cell by accessing the value attribute of the cell. Now, we have three sheets, and each sheet contains some data. We can read the data as follows:

>>> # importing openpyxl
>>> import openpyxl
>>> # loading workbook
>>> work_book = openpyxl.load_workbook('example.xlsx')
>>> # Getting First Sheet
>>> sheet_1 = work_book['First Sheet']
>>> # Getting Second Sheet
>>> sheet_2 = work_book['Second Sheet']
>>> # Getting Third Sheet
>>> sheet_3 = work_book['Third Sheet']
>>> # printing data from 'A1' cell of First Sheet
>>> print(sheet_1['A1'].value)
Name
>>> # printing data from 'A1' cell of Second Sheet
>>> print(sheet_2['A1'].value)
ID
>>> # printing data from 'A1' cell of Third Sheet
>>> print(sheet_3['A1'].value)
Grades
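When a sheet holds more than a few cells, reading them one at a time becomes tedious. As a minimal sketch, the iter_rows() method of a worksheet can return whole rows; the sample data below is made up, and the values_only argument assumes a reasonably recent openpyxl release.

```python
import openpyxl

work_book = openpyxl.Workbook()
sheet = work_book.active
sheet.append(["Name", "ID", "Grade"])
sheet.append(["Alice", 1, "A"])
sheet.append(["Bob", 2, "B"])

# values_only=True yields plain tuples of values instead of cell objects.
rows = list(sheet.iter_rows(min_row=1, max_row=sheet.max_row, values_only=True))
for row in rows:
    print(row)
```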

Changing Fonts and Colors

Next, we are going to show you how to change the font of a cell by using the Font class. First, import Font from the openpyxl.styles module. The Font() constructor takes a number of keyword arguments, including:

  • name (string): the name of the font
  • size (int or float): the size of the font
  • underline (string): the underline type
  • color (string): the hexadecimal color of the text
  • italic (bool): whether the font is italicized
  • bold (bool): whether the font is bolded

To apply styles, we must first create a Font object by passing the desired parameters to Font(). Then, we select the sheet, and inside the sheet, we select the cell to which we want to apply the style. Finally, we assign the Font object to the font attribute of the selected cell.

>>> # importing openpyxl
>>> import openpyxl
>>> # importing Font class from openpyxl.styles
>>> from openpyxl.styles import Font
>>> # loading workbook
>>> work_book = openpyxl.load_workbook('example.xlsx')
>>> # Creating style object
>>> style = Font(name='Consolas', size=13, bold=True,
...              italic=False)
>>> # Selecting Sheet from Workbook
>>> sheet_1 = work_book['First Sheet']
>>> # Selecting the cell we want to add styles
>>> a1 = sheet_1['A1']
>>> # Applying Styles to the cell
>>> a1.font = style
>>> # Saving workbook
>>> work_book.save('example.xlsx')
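The color and underline arguments from the list above are not used in the preceding example. The following minimal sketch applies them to a cell in a fresh in-memory workbook; the font name and the red color 'FF0000' are arbitrary choices for illustration.

```python
import openpyxl
from openpyxl.styles import Font

work_book = openpyxl.Workbook()
sheet = work_book.active
sheet["A1"] = "Name"

# Bold, italic, single-underlined red text.
header_font = Font(name="Calibri", size=12, bold=True, italic=True,
                   underline="single", color="FF0000")
sheet["A1"].font = header_font

# Read the settings back from the cell.
cell_font = sheet["A1"].font
print(cell_font.name, cell_font.size, cell_font.bold, cell_font.underline)
print(cell_font.color.rgb)
```

Note that openpyxl stores colors with an additional alpha component, so the color read back from the cell may be longer than the six hexadecimal digits that were passed in.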

Applying Borders to Cells

We can apply borders to the cells in an Excel sheet using the Border and Side classes of the openpyxl.styles.borders module. Border() accepts one Side object for each edge of the cell, passed as keyword arguments. The following are some of the keyword arguments that are passed to Border() to define where the border is drawn.

  • left: apply a border to the left side of a cell
  • right: apply a border to the right side of a cell
  • top: apply a border to the top of a cell
  • bottom: apply a border to the bottom of a cell

Each of these arguments takes a Side object, and the style argument of Side() defines the style of the border (e.g., thin, dashed). The style argument can take values including the following.

  • double: a double line border
  • dashed: a dashed border
  • thin: a thin border
  • medium: a medium border
  • mediumDashDot: a dashed and dotted border of medium weight
  • thick: a thick border
  • dashDot: a dashed and dotted border
  • hair: a very thin border
  • dotted: a dotted border

Now, we will apply different types of borders to different cells of our spreadsheets. First, we select cells, and then, we define border styles and apply these styles to different cells.

>>> # importing openpyxl
>>> import openpyxl
>>> # importing Border and Side classes
>>> from openpyxl.styles.borders import Border, Side
>>> # loading workbook
>>> work_book = openpyxl.load_workbook('example.xlsx')
>>> # Selecting Sheet
>>> sheet_1 = work_book['First Sheet']
>>> # Selecting different cells from sheet
>>> cell_1 = sheet_1['A1']
>>> cell_2 = sheet_1['B2']
>>> cell_3 = sheet_1['C3']
>>> # Defining different border styles
>>> style_1 = Border(bottom=Side(style='dotted'))
>>> style_2 = Border(right=Side(style='thin'))
>>> style_3 = Border(top=Side(style='dashDot'))
>>> # applying border styles to the cells
>>> cell_1.border = style_1
>>> cell_2.border = style_2
>>> cell_3.border = style_3
>>> # Saving workbook
>>> work_book.save('example.xlsx')
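A single Border object can also combine several sides, and the same border can be applied to a range of cells. The following is a minimal sketch on a fresh in-memory workbook; the range A1:B2 and the red color are arbitrary choices for illustration.

```python
import openpyxl
from openpyxl.styles.borders import Border, Side

work_book = openpyxl.Workbook()
sheet = work_book.active

# One Side definition reused for all four edges of the cell.
thin_red = Side(style="thin", color="FF0000")
box = Border(left=thin_red, right=thin_red, top=thin_red, bottom=thin_red)

# Slicing a sheet by a range returns rows of cells.
for row in sheet["A1:B2"]:
    for cell in row:
        cell.border = box

print(sheet["B2"].border.left.style, sheet["B2"].border.bottom.style)
```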

Adjusting Row and Column Dimensions

The row height and column width of an Excel document can also be adjusted using Python. Each worksheet has two attributes for this purpose: row_dimensions and column_dimensions. First, we select the sheet in which we want to change the column width or row height. Then, we set the height attribute of a specific row or the width attribute of a specific column.

>>> # importing openpyxl
>>> import openpyxl
>>> # loading workbook
>>> work_book = openpyxl.load_workbook('example.xlsx')
>>> # selecting sheet
>>> sheet_1 = work_book['First Sheet']
>>> # changing the height of first row
>>> sheet_1.row_dimensions[1].height = 50
>>> # Saving workbook
>>> work_book.save('example.xlsx')

Similarly, we can change the width of a column using the following code:

>>> # selecting sheet from excel workbook
>>> sheet_2 = work_book['Second Sheet']
>>> # changing the width of A column
>>> sheet_2.column_dimensions['A'].width = 50
>>> # Saving workbook
>>> work_book.save('example.xlsx')

The above code changes the height of the first row to 50 points and the width of column A to 50. Note that Excel measures column width in character units rather than points.
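To set the width of several columns at once, column numbers can be converted to letters with get_column_letter() from openpyxl.utils. The following is a minimal sketch; the widths and the row height are made-up values.

```python
import openpyxl
from openpyxl.utils import get_column_letter

work_book = openpyxl.Workbook()
sheet = work_book.active

# Hypothetical widths for columns 1 to 3 (A, B, and C).
widths = [12, 30, 18]
for column_number, width in enumerate(widths, start=1):
    letter = get_column_letter(column_number)
    sheet.column_dimensions[letter].width = width

# Row heights are set through row_dimensions, indexed by row number.
sheet.row_dimensions[1].height = 24

print(sheet.column_dimensions["B"].width, sheet.row_dimensions[1].height)
```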

Merging and Unmerging Cells

When working with Excel spreadsheets, we often need to merge and unmerge cells. The worksheet object offers the merge_cells() method, which can be used to merge cells in Excel. The merged cell is referred to by the name of its top-left cell. For example, if we merge the cells from cell A1 to cell B2, then the newly formed cell will be referred to as A1. To merge cells using openpyxl, we first select the sheet, and then we apply the merge_cells() method to the sheet.

>>> # importing openpyxl module
>>> import openpyxl
>>> # loading workbook
>>> work_book = openpyxl.load_workbook('example.xlsx')
>>> # selecting first sheet from excel workbook
>>> sheet_1 = work_book['First Sheet']
>>> # merging cells from A1 to B2 in Sheet 1
>>> sheet_1.merge_cells('A1:B2')
>>> # saving workbook
>>> work_book.save('example.xlsx')

Similarly, the unmerge_cells() method can be used to unmerge cells in an Excel spreadsheet. The following code can be used to unmerge cells:

>>> # selecting sheet from workbook
>>> sheet_1 = work_book['First Sheet']
>>> # unmerging cells from A1 to B2
>>> sheet_1.unmerge_cells('A1:B2')
>>> # saving workbook
>>> work_book.save('example.xlsx')
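To check which ranges of a sheet are currently merged, the merged_cells attribute of the worksheet can be inspected. The following minimal sketch merges and unmerges a range in a fresh in-memory workbook; the cell value 'Title' is made up for illustration.

```python
import openpyxl

work_book = openpyxl.Workbook()
sheet = work_book.active
sheet["A1"] = "Title"

# Merge A1:B2 and record the merged ranges as strings such as 'A1:B2'.
sheet.merge_cells("A1:B2")
merged_after_merge = [cell_range.coord for cell_range in sheet.merged_cells.ranges]

# Unmerge the same range; the list of merged ranges is empty again.
sheet.unmerge_cells("A1:B2")
merged_after_unmerge = [cell_range.coord for cell_range in sheet.merged_cells.ranges]

print(merged_after_merge, merged_after_unmerge, sheet["A1"].value)
```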

Conclusion

Excel spreadsheets are commonly used for data manipulation, but many spreadsheet tasks are repetitive and monotonous. In such cases, programming can be used to automate spreadsheet manipulation.

In this article, we discussed some of the useful features of Python's openpyxl module. We showed you how to create, read, and modify Excel spreadsheets, how to add and remove sheets, how to change fonts, borders, and row and column dimensions, and how to merge and unmerge cells. By applying these techniques, you can automate many spreadsheet manipulation tasks using Python.
