DataFrame created with a MultiIndex contains additional, corrupted columns

Li.

Joined
Apr 4, 2023
Messages
2
Reaction score
0
I'm trying to write an automated program that converts an Excel table into a hierarchical graph.

I load my Excel table and get this data:
Python:
self.df = pd.read_excel(self.file, sheet_name="Checklist", engine="openpyxl", header=[10])
print(self.df)
   Test case name                 Testing status 2023-01-01  Testing status 2023-01-02  Testing status 2023-02-20  Testing status 2023-03-15
0  SW password                    PASS                       FAILED                     FAILED                     PASS
1  Access levels                  PASS                       NOT TESTED                 PASS                       PASS
2  Local license server           NOT TESTED                 NOT TESTED                 PASS                       PASS
3  High level security            NOT TESTED                 PASS                       PASS                       PASS
4  Interruption in communication  FAILED                     PASS                       PASS                       PASS
5  Writing parameters             FAILED                     FAILED                     FAILED                     FAILED



Then I use pd.MultiIndex to group the data and get the result I want:
Python:
index = pd.MultiIndex.from_frame(self.df)                 
print(index)
MultiIndex([( 'SW password', 'PASS', 'FAILED', ...),
( 'Access levels', 'PASS', 'NOT TESTED', ...),
( 'Local license server', 'NOT TESTED', 'NOT TESTED', ...),
( 'High level security', 'NOT TESTED', 'PASS', ...),
names=['Test case name', 'Testing status 2023-01-01', 'Testing status 2023-01-02', 'Testing status 2023-02-20', 'Testing status 2023-03-15'])

After this I create a DataFrame object and see that additional, corrupted columns appear. How can I fix it?
Python:
self.dataFrame = pd.DataFrame(data=self.df, index=index)
print(self.dataFrame)

Test case name ... Testing status 2023-03-15
Test case name Testing status 2023-01-01 Testing status 2023-01-02 Testing status 2023-02-20 Testing status 2023-03-15 ...

SW password PASS FAILED FAILED PASS NaN ... NaN
Access levels PASS NOT TESTED PASS PASS NaN ... NaN
Local license server NOT TESTED NOT TESTED PASS PASS NaN ... NaN
High level security NOT TESTED PASS PASS PASS NaN ... NaN
Interruption in communication FAILED PASS PASS PASS NaN ... NaN
Writing parameters FAILED FAILED FAILED FAILED NaN ... NaN
[6 rows x 5 columns]
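
A likely explanation for the NaN columns, offered as a sketch rather than a confirmed diagnosis: when `pd.DataFrame` receives an existing DataFrame as `data` together with a new `index`, pandas aligns the rows on that index instead of relabelling them. The original rows are labelled 0 to 5, none of which match the MultiIndex tuples, so the cells come out as NaN. The example below uses a small made-up frame (values copied from the printout above; `raw`, `indexed` and `multi` are illustrative names) and moves columns into the index with `set_index`, which keeps the data intact.

```python
import pandas as pd

# Small stand-in for the sheet loaded with pd.read_excel (values from the thread).
raw = pd.DataFrame(
    {
        "Test case name": ["SW password", "Access levels", "Writing parameters"],
        "Testing status 2023-01-01": ["PASS", "PASS", "FAILED"],
        "Testing status 2023-01-02": ["FAILED", "NOT TESTED", "FAILED"],
    }
)

# set_index moves a column into the row index and keeps the remaining
# columns as data, so nothing has to be realigned.
indexed = raw.set_index("Test case name")
print(indexed)

# If a MultiIndex over several columns is wanted, pass a list of column
# names; the columns that are not listed stay as ordinary data.
multi = raw.set_index(["Test case name", "Testing status 2023-01-01"])
print(multi)
```

Whether a single-level or multi-level index fits better depends on how the hierarchical graph is built later; putting every column into the index leaves no data columns at all.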
 

Joined
Sep 4, 2022
Messages
128
Reaction score
16
Python:
Syntax: df.iloc[row index range, column index range]

Look at this tutorial:



As your Excel sheet is long and contains blank fields, you have to apply a constraint on the selected rows and columns to retrieve only the interesting fields.
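
To illustrate the `iloc` syntax mentioned above, here is a minimal sketch on a made-up frame (`sheet` and `block` are illustrative names, not taken from the original workbook). `iloc` selects by integer position, and the stop value of each range is excluded.

```python
import pandas as pd

# Invented stand-in for a sheet with a blank trailing row and column.
sheet = pd.DataFrame(
    {
        "Test case name": ["SW password", "Access levels", None],
        "Testing status 2023-01-01": ["PASS", "PASS", None],
        "Notes": [None, None, None],
    }
)

# Rows 0-1 and columns 0-1 by position; the stop value is exclusive.
block = sheet.iloc[0:2, 0:2]
print(block)
```

Fixed ranges like these only work when the position of the data is known in advance.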
 

Li.

Joined
Apr 4, 2023
Messages
2
Reaction score
0
Thanks. I was thinking about that, but how can I manage it if my table has a different number of columns and rows each time? With a "for ... in ..." loop over the df.iloc rows?
I tried some commands to show only the columns with values and hide the columns with NaN, but it did not help.
Python:
df.drop.nan()

df=df[df['str_field'].str.len() > 0]

With the following code I still get corrupted data:
Python:
self.dataFrame = pd.DataFrame(data=self.df, index=index)
self.dataFrame.loc[:,['Testing status' in i for i in self.dataFrame.columns]]
print(self.dataFrame)
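
Two details in the attempts above may be worth checking. First, `df.drop.nan()` is not a pandas call; the method for removing missing values is `DataFrame.dropna`. Second, `.loc[...]` returns a new object and does not change `self.dataFrame` in place, so the selection has to be assigned before printing. The sketch below shows both on invented data, assuming the unwanted rows and columns contain only missing values (`frame`, `trimmed` and `status_only` are illustrative names). Because `dropna(how="all")` looks at the content rather than at fixed positions, it also copes with tables whose size changes from run to run.

```python
import pandas as pd

# Invented stand-in for a sheet with trailing blank rows and columns.
frame = pd.DataFrame(
    {
        "Test case name": ["SW password", "Access levels", None],
        "Testing status 2023-01-01": ["PASS", "PASS", None],
        "Testing status 2023-01-02": ["FAILED", "NOT TESTED", None],
        "Unnamed: 3": [None, None, None],
    }
)

# Drop columns, then rows, that contain only missing values.
trimmed = frame.dropna(axis=1, how="all").dropna(axis=0, how="all")

# Boolean mask over the column labels; str() guards against non-string labels.
mask = ["Testing status" in str(col) for col in trimmed.columns]

# .loc returns a new object, so the result must be assigned.
status_only = trimmed.loc[:, mask]
print(status_only)
```

`trimmed.filter(like="Testing status")` gives the same column selection without building the mask by hand.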
 