Creating a graph in excel

import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_excel(“yield vs CWP_original.xlsx”, skiprows=[1], usecols=[1, 4])

x = df[“CWP”]
y = df[“YIELD”]

plt.scatter(x, y)
plt.xlabel(‘CWP’)
plt.ylabel(‘YIELD’)
plt.title(‘Yield vs CWP’)
plt.show()

I have got this code and it runs perfect but it shows the scatterplot in the window. I am wanting to try create the plot in excel and save it in the same sheet that the data is extracted from. Any idea what kind of code I can use to do this?

Hi Simon,

You could try the code below and that will save the image in the excel file and give you a separate image file also.

import pandas as pd
import matplotlib.pyplot as plt
import openpyxl
from openpyxl.utils.dataframe import dataframe_to_rows
from openpyxl.drawing.image import Image

# Read the data from Excel file
df = pd.read_excel("yield vs CWP_original.xlsx", skiprows=[1], usecols=[1, 4])

x = df["CWP"]
y = df["YIELD"]

# Create the scatter plot
plt.scatter(x, y)
plt.xlabel('CWP')
plt.ylabel('YIELD')
plt.title('Yield vs CWP')

# Save the plot as an image
plot_image_path = "scatter_plot.png"
plt.savefig(plot_image_path)

# Load the Excel file using openpyxl
wb = openpyxl.load_workbook("yield vs CWP_original.xlsx")
sheet = wb.active

# Insert the image in the Excel sheet
img = Image(plot_image_path)
sheet.add_image(img, "E2")  # You can change the cell reference (E2) as per your preference

# Save the modified Excel file
wb.save("yield vs CWP_original.xlsx")

# Close the plot
plt.close()

great thank you. It worked perfectly. is one able to save it not as a picture but but as a normal graph that can be edited in excel?

Hi simon, you could try the modified code below to make the graph into an excel graph but it might not work in some versions of excel (office365 web version) and potentially some older desktop versions.

import pandas as pd
import matplotlib.pyplot as plt
import openpyxl
from openpyxl import Workbook
from openpyxl.utils.dataframe import dataframe_to_rows
from openpyxl.chart import ScatterChart, Reference, Series

# Read the data from Excel file
df = pd.read_excel("yield vs CWP_original.xlsx", skiprows=[1], usecols=[1, 4])

x = df["CWP"]
y = df["YIELD"]

# Create the scatter plot using matplotlib (Optional for visualization)
plt.scatter(x, y)
plt.xlabel('CWP')
plt.ylabel('YIELD')
plt.title('Yield vs CWP')
plt.show()

# Create a new Excel workbook and add the data
wb = Workbook()
sheet = wb.active
for r in dataframe_to_rows(df, index=False, header=True):
    sheet.append(r)

# Create the scatter chart
chart = ScatterChart()
chart.title = "Yield vs CWP"
xvalues = Reference(sheet, min_col=1, min_row=2, max_row=len(df) + 1, max_col=1)
yvalues = Reference(sheet, min_col=2, min_row=1, max_row=len(df) + 1, max_col=2)
series = Series(yvalues, xvalues, title_from_data=True)
chart.series.append(series)
sheet.add_chart(chart, "E2")  # You can change the cell reference (E2) as per your preference

# Save the Excel file
wb.save("yield vs CWP_original_with_chart.xlsx")

awesome thank you. It creates a scatterchart with lines. I am guessing that may be due to excel? I have microsoft 360.

I think so, if you see the link below the scatter plot has the same issue. Maybe take a look at the bubble chart at the top and try to incorporate the syntax for that into your code

link

I am trying to add statistics to my charts. Now the first image does this fine. the second image you cant see the markers but they are there. How does one make this markers visible. Any ideas what I am doing wrong?


import pandas as pd
import matplotlib.pyplot as plt
import openpyxl
from openpyxl.utils.dataframe import dataframe_to_rows
from openpyxl.chart import ScatterChart, Reference, Series
from openpyxl import Workbook
import scipy.stats as stats

Read the data from Excel file

df = pd.read_excel(“yield vs CWP_original.xlsx”, skiprows=[1], usecols=[1, 4])

x = df[“CWP”]
y = df[“YIELD”]

Perform linear regression

slope, intercept, r_value, p_value, std_err = stats.linregress(x, y)
r_squared = r_value ** 2

Create the scatter plot with marker options

plt.scatter(x, y, marker=‘o’, label=‘Data Points’, color=‘blue’)
plt.plot(x, slope * x + intercept, color=‘red’, label=‘Regression Line’)
plt.xlabel(‘CWP’)
plt.ylabel(‘YIELD’)
plt.title(‘Yield vs CWP’)

Add equation and R-squared value to the graph

equation_text = f"YIELD = {slope:.2f} * CWP + {intercept:.2f}"
r_squared_text = f"R-squared = {r_squared:.4f}"
plt.text(0.05, 0.95, equation_text, transform=plt.gca().transAxes,
fontsize=12, verticalalignment=‘top’)
plt.text(0.05, 0.88, r_squared_text, transform=plt.gca().transAxes,
fontsize=12, verticalalignment=‘top’)
plt.show()

Create a new Excel workbook and add the data

wb = Workbook()
sheet = wb.active
for r in dataframe_to_rows(df, index=False, header=True):
sheet.append(r)

Create the scatter chart

chart = ScatterChart()
chart.title = “Yield vs CWP”
xvalues = Reference(sheet, min_col=1, min_row=2,
max_row=len(df) + 1, max_col=1)
yvalues = Reference(sheet, min_col=2, min_row=1,
max_row=len(df) + 1, max_col=2)
series = Series(yvalues, xvalues, title_from_data=True)
chart.series.append(series)

Add equation and R-squared value as text on the chart

sheet.cell(row=len(df) + 4, column=1, value=equation_text)
sheet.cell(row=len(df) + 5, column=1, value=r_squared_text)

You can change the cell reference (E2) as per your preference

sheet.add_chart(chart, “J2”)

Save the modified Excel file

wb.save(“yield vs CWP_original_with_chart.xlsx”)

When you create the scatter you could pass some keyword args like below, that might do the trick.

plt.scatter(x, y, marker='o', label='Data Points', facecolor='blue', edgecolor='blue')

so I have everything working. The chart is plotting in excel although as lines but this seems more a excel and python problem. I have added statistics like R2 and the equation and this can be plotted on the chart in the window but one cannot save it a text on the chart in excel. you have to save it in a cell. This is what i have read from the internet. Is this true that stats cant be added as txt on the chart?

Hi Simon,

That’s great to hear you have everything working.

Yes, I believe excel’s charting features are fairly limited out of the box so its my understanding also is that you cannot add text to the chart directly.

You could however save those statistics as text in a cell and then display that cell’s content on the chart as a data label

I didnt think about that. Thats a good idea. I will see if i can get that to work.

I am confused. when i run my code i get no errors so i assume everything is fine. the scatterchart shows up in the window correct and then i open the png file and its blank. Have i made an error in my code.

import pandas as pd
import matplotlib.pyplot as plt
import openpyxl
import scipy.stats as stats

df = pd.read_excel(“yield vs CWP_original.xlsx”, skiprows=[1], usecols=[1, 4])

print(df)

x = df[“CWP”]
y = df[“YIELD”]

Perform linear regression

slope, intercept, r_value, p_value, std_err = stats.linregress(x, y)
r_squared = r_value ** 2

equation_text = f"Y = {slope:.2f}X + {intercept:.2f}"
r_squared_text = f"R-squared = {r_squared:.4f}"

plt.scatter(x, y, marker=‘o’, label=‘Data Points’, color=‘blue’)
plt.plot(x, slope * x + intercept, color=‘red’, label=‘Regression Line’)
plt.xlabel(‘CWP’)
plt.ylabel(‘YIELD’)
plt.text(0.05, 0.95, equation_text, transform=plt.gca().transAxes,
fontsize=12, verticalalignment=‘top’)
plt.text(0.05, 0.88, r_squared_text, transform=plt.gca().transAxes,
fontsize=12, verticalalignment=‘top’)
plt.show()

Save the plot as an image

plot_image_path = “YIELD VS CWP.png”
plt.savefig(plot_image_path)

close the plot

plt.close()


i got it working again. I think something was wrong when i tried adding statistics in.