How to convert CSV to parquet file without RLE_DICTIONARY encoding?

I've already tested three ways of converting a CSV file to a Parquet file. You can find them below. All three created the Parquet file. When I try to view the contents of the file using "APACHE PARQUET VIEWER" on Windows, I always get the following error message:

"encoding RLE_DICTIONARY is not supported"

Is there any way to avoid this? Maybe a way to use another type of encoding? The code is below:

1º Using pandas:

Python:
import pandas as pd
df = pd.read_csv("filename.csv")
df.to_parquet("filename.parquet")

2º Using pyarrow:

Python:
from pyarrow import csv, parquet
table = csv.read_csv("filename.csv")
parquet.write_table(table, "filename.parquet")

3º Using dask:

Python:
from dask.dataframe import read_csv
dask_df = read_csv("filename.csv", dtype={'column_xpto': 'float64'})
dask_df.to_parquet("filename.parquet")
 
