I have a very large .csv file containing a numerical matrix. I need to calculate the mean of many selections of values in each row (e.g. in each row, the mean of the values at indexes 1, 3, 52, 123; then, in the same row, the mean of the values at indexes 2, 3, 12, 29, 67, etc.).

My file is HUGE, even in row length (a row has 8000+ items), so I need this to be fast. I know the indexes I need to average at the start of the computation, but I don't have enough memory to load the whole file at once.
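
For reference, the computation described above can be sketched in plain Python by streaming the file one row at a time, so memory use depends on the row length rather than the file size. This is only a minimal sketch under assumptions not stated in the post: the file has no header row, every cell is numeric, and the indexes are zero-based. The function name `row_selection_means` and the file name used below are hypothetical.

```python
import csv


def row_selection_means(path, selections):
    """Yield, for each CSV row, one mean per index selection.

    Assumes: no header row, numeric cells, zero-based indexes.
    """
    with open(path, newline="") as handle:
        for row in csv.reader(handle):
            # Only the current row is held in memory.
            means = []
            for selection in selections:
                values = [float(row[i]) for i in selection]
                means.append(sum(values) / len(values))
            yield means
```

Calling `row_selection_means("matrix.csv", [[1, 3, 52, 123], [2, 3, 12, 29, 67]])` would then produce one list of means per row, which can be written out as it is generated. Pure Python may well be too slow for a file this size; the point is only that the task does not require loading the whole file.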

@MrHedmad If you are comfortable writing SQL then #duckdb might be useful: https://duckdb.org/docs/data/csv/overview.html and if you are an #rstats person then you can query duckdb with #dplyr, too.

@thomas_sandmann
This looks interesting, and I'm OK with SQL. But at a glance I couldn't tell whether it needs to load the whole file in memory at once or not...
@MrHedmad duckdb tables can be stored in memory (less helpful) or be written to disk, e.g. by pointing the duckdb::duckdb() #rstats function to a location on disk. Then you are only limited by your available disk space, not RAM. This example might be helpful: https://bwlewis.github.io/duckdb_and_r/taxi/taxi.html

@MrHedmad And if you prefer parquet files as a starting point, then duckdb can do the conversion as well: https://rmoff.net/2023/03/14/quickly-convert-csv-to-parquet-with-duckdb/ without reading the full file into memory.