losing-end-of-row values when manipulating CSV input

Neil Berg · Jul 13, 2011

Hello all,

I am having an issue with my attempts to accurately filter some data from a CSV file I am importing. I have attached both a sample of the CSV data and my script.

The attached CSV file contains two rows and 27 columns of data. The first column is the station ID "BLS", the second column is the sensor number "4", the third column is the date, and the remaining 24 columns are hourly temperature readings.

In my attached script, I read in row[3:] to extract just the temperatures, do a sanity check to make sure there are 24 values, remove any missing or "m" values, and then append the non-missing values into the "hour_list".

Strangely, the first seven rows appear to be empty after reading in the CSV file, which is why I had to incorporate the if len(temps) == 24 statement.

But the real issue is that for days with no missing values, for example the second row of data, the length of the hour_list should be 24. My script, however, is returning 23. I think this is because the end-of-row values have a trailing "\". This must mark these numbers as non-digits, so they are lost in my "isdig" filter line. I've tried several ways to remove this trailing "\", but with no success.
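
A quick way to check that suspicion, as a minimal sketch: the value "57\" below is made up for illustration and is not taken from the attached file. It shows that str.isdigit() rejects a string as soon as it contains any non-digit character, including a trailing backslash.

```python
# Hypothetical end-of-row value: two digits followed by one literal backslash.
# In Python source the backslash has to be written as "\\".
last_value = "57\\"

has_backslash = last_value.endswith("\\")   # the stray character is really there
passes_filter = last_value.isdigit()        # what an isdigit-based filter sees
clean_passes = "57".isdigit()               # the same number without the backslash

print(len(last_value), has_backslash, passes_filter, clean_passes)
```

Note that isdigit() would also reject values with a minus sign or a decimal point, so a filter built on it only works if the readings are non-negative whole numbers.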

Do you have any suggestions on how to fix this issue?

Many thanks in advance,

Neil Berg

Marco Nawijn · Jul 13, 2011

Attachments: csv_test.py (1K), csv_sample.csv (< 1K)

Hello Neil,

I just had a quick look at your script. To remove the trailing "\" you
can use val = val.rstrip('\\') in your script. Note the double
backslash.

The script now returns 24 items in the hour_list.
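
A minimal sketch of where that rstrip call could sit in the filtering steps described in the original post. The attached script is not reproduced in this thread, so the sample rows, the dates, and the helper name hourly_values are assumptions made for illustration only.

```python
import csv
import io

# Made-up data in the layout described above: station ID, sensor number,
# date, then 24 hourly values, with the last value ending in a backslash.
row1 = ["BLS", "4", "2011-01-01", "m"] + [str(40 + h) for h in range(22)] + ["62\\"]
row2 = ["BLS", "4", "2011-01-02"] + [str(50 + h) for h in range(23)] + ["73\\"]
sample = "\n" + ",".join(row1) + "\n" + ",".join(row2) + "\n"  # leading blank line


def hourly_values(row):
    """Return the non-missing temperature strings, or None for malformed rows."""
    temps = row[3:]
    if len(temps) != 24:          # skips empty or short rows
        return None
    # Strip the trailing backslash before testing for digits.
    cleaned = [val.rstrip("\\") for val in temps]
    return [val for val in cleaned if val.isdigit()]


results = []
for row in csv.reader(io.StringIO(sample)):
    hour_list = hourly_values(row)
    if hour_list is not None:
        results.append((row[2], len(hour_list)))

print(results)
```

In this made-up sample the first day has one "m" value and the second day has none, so the two counts should differ by one once the backslash is stripped.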

Good luck!

Marco

Marco Nawijn · Jul 13, 2011

Dear Neil,

Don't know if this is a double post (previous post seems to be gone),
but val = val.rstrip('\\') should fix your problem. Note the double
backslash.

Kind regards,

Marco
