How to convert a CSV file to a Parquet file without RLE_DICTIONARY encoding?


I've already tested three ways of converting a CSV file to a Parquet file; you can find them below. All three created the Parquet file. However, when I try to view its contents with "APACHE PARQUET VIEWER" on Windows, I always get the following error message:

"encoding RLE_DICTIONARY is not supported"

Is there any way to avoid this, for example by writing the file with a different encoding? The code is below:

1. Using pandas:

Python:
import pandas as pd
# to_parquet needs a Parquet engine installed (pyarrow or fastparquet)
df = pd.read_csv("filename.csv")
df.to_parquet("filename.parquet")

2. Using pyarrow:

Python:
from pyarrow import csv, parquet
table = csv.read_csv("filename.csv")
parquet.write_table(table, "filename.parquet")

3. Using dask:

Python:
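from dask.dataframe import read_csv
# An explicit dtype avoids a wrong type guess for this column
dask_df = read_csv("filename.csv", dtype={'column_xpto': 'float64'})
# Note: dask writes a directory of part files at this path, not a single file
dask_df.to_parquet("filename.parquet")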
 