How to convert CSV to parquet file without RLE_DICTIONARY encoding?

rcmv · Sep 2, 2022

I've already tested three ways of converting a CSV file to a Parquet file. You can find them below. All three created the Parquet file. When I try to view the contents of any of these files using "APACHE PARQUET VIEWER" on Windows, I always get the following error message:

"encoding RLE_DICTIONARY is not supported"

Is there any way to avoid this, for example by writing the file with a different encoding? The code is below:

1º Using pandas:

Python:

import pandas as pd
df = pd.read_csv("filename.csv")
df.to_parquet("filename.parquet")

2º Using pyarrow:

Python:

from pyarrow import csv, parquet
table = csv.read_csv("filename.csv")
parquet.write_table(table, "filename.parquet")

3º Using dask:

Python:

from dask.dataframe import read_csv
dask_df = read_csv("filename.csv", dtype={'column_xpto': 'float64'})
dask_df.to_parquet("filename.parquet")
