Data Science Computing with Rust on Debian
1. Environment Setup
2. Common Libraries and Their Uses
3. Quick-Start Examples
Create a new project: cargo new data_analysis && cd data_analysis
Edit Cargo.toml:
[dependencies]
polars = "0.36"
ndarray = "0.16"
# ndarray-stats and ndarray-rand must target the same ndarray release as above
ndarray-stats = "0.6"
ndarray-rand = "0.15"
rand = "0.8"
noisy_float = "0.2"
Example code, src/main.rs (builds a DataFrame, computes a mean, and bins random data into a histogram):
use ndarray::Array2;
use ndarray_rand::{rand_distr::StandardNormal, RandomExt};
use ndarray_stats::{
    histogram::{strategies::Sqrt, GridBuilder},
    HistogramExt,
};
use noisy_float::types::{n64, N64};
use polars::prelude::*;

fn main() -> Result<(), PolarsError> {
    // 1) Polars: build a DataFrame and compute the mean of column "y"
    let df = DataFrame::new(vec![
        Series::new("x", &[1, 2, 3, 4, 5]),
        Series::new("y", &[2.0, 4.0, 6.0, 8.0, 10.0]),
    ])?;
    // mean() yields None only when the column has no non-null values
    let mean_y: f64 = df.column("y")?.f64()?.mean().unwrap();
    println!("Mean of y: {}", mean_y);

    // 2) ndarray + ndarray-stats: generate data and print the histogram counts as text
    let data: Array2<f64> = Array2::random((10_000, 2), StandardNormal);
    // The histogram needs totally ordered values, so wrap each f64 in N64
    let data_n64 = data.mapv(|x| n64(x));
    // Each of the two columns becomes one histogram dimension, binned with the Sqrt strategy
    let grid = GridBuilder::<Sqrt<N64>>::from_array(&data_n64).unwrap().build();
    let hist = data_n64.histogram(grid);
    println!("Histogram counts: {:?}", hist.counts());
    Ok(())
}
Run: cargo run
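To make the mean and histogram steps less opaque, here is a dependency-free sketch of the same two ideas for one-dimensional data. It assumes the common square-root rule (number of bins = ceil(sqrt(n)), equal-width bins over the data range); this is an illustration of the idea, not the actual ndarray-stats implementation, and the function names mean and sqrt_histogram are hypothetical. Inputs are assumed to be finite (no NaN).

```rust
/// Arithmetic mean; returns None for an empty slice.
fn mean(data: &[f64]) -> Option<f64> {
    if data.is_empty() {
        return None;
    }
    Some(data.iter().sum::<f64>() / data.len() as f64)
}

/// Counts per bin, using ceil(sqrt(n)) equal-width bins spanning [min, max].
/// Assumption: all values are finite.
fn sqrt_histogram(data: &[f64]) -> Vec<usize> {
    if data.is_empty() {
        return Vec::new();
    }
    let n_bins = (data.len() as f64).sqrt().ceil() as usize;
    let min = data.iter().cloned().fold(f64::INFINITY, f64::min);
    let max = data.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
    let width = (max - min) / n_bins as f64;

    let mut counts = vec![0usize; n_bins];
    for &x in data {
        // When all values are equal the width is zero: put everything in bin 0.
        let idx = if width > 0.0 {
            // Clamp so that the maximum value lands in the last bin.
            (((x - min) / width) as usize).min(n_bins - 1)
        } else {
            0
        };
        counts[idx] += 1;
    }
    counts
}

fn main() {
    let data = [1.0, 2.0, 2.5, 3.0, 4.0, 4.5, 5.0, 7.0, 9.0];
    println!("mean = {:?}", mean(&data));
    println!("counts = {:?}", sqrt_histogram(&data));
}
```

With nine samples the rule gives three bins, so the counts vector has three entries that sum to nine.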
Create a new project: cargo new ml_linreg && cd ml_linreg
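Edit Cargo.toml:
[dependencies]
linfa = "0.6"
linfa-linear = "0.6"
ndarray = "0.15"
Example code, src/main.rs:
use linfa::prelude::*;
use linfa_linear::LinearRegression;
use ndarray::array;

fn main() {
    // The second feature is not a linear function of the first,
    // so the least-squares problem has a unique solution
    let x = array![[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 6.0]];
    let y = array![3.0, 5.0, 7.0, 9.0, 11.0];
    // linfa models are fitted on a Dataset that pairs records with targets
    let dataset = Dataset::new(x, y);
    let model = LinearRegression::default();
    let fitted = model.fit(&dataset).unwrap();
    let preds = fitted.predict(&dataset);
    println!("Predictions: {:?}", preds);
}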
Run: cargo run
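For intuition about what a linear-regression fit computes, the following is a minimal sketch of ordinary least squares for a single feature, using only the standard library. It is not how linfa implements fitting; the function name fit_line is hypothetical, and the data set (x = 1..5, y = 2x + 1) is chosen so the result is easy to check by hand.

```rust
/// Fit y ≈ slope * x + intercept by ordinary least squares.
/// Returns None if the inputs differ in length, have fewer than two
/// points, or if all x values are identical (slope undefined).
fn fit_line(x: &[f64], y: &[f64]) -> Option<(f64, f64)> {
    if x.len() != y.len() || x.len() < 2 {
        return None;
    }
    let n = x.len() as f64;
    let mean_x = x.iter().sum::<f64>() / n;
    let mean_y = y.iter().sum::<f64>() / n;

    // Sum of cross-deviations and sum of squared deviations of x.
    let sxy: f64 = x
        .iter()
        .zip(y)
        .map(|(a, b)| (a - mean_x) * (b - mean_y))
        .sum();
    let sxx: f64 = x.iter().map(|a| (a - mean_x).powi(2)).sum();
    if sxx == 0.0 {
        return None;
    }

    let slope = sxy / sxx;
    let intercept = mean_y - slope * mean_x;
    Some((slope, intercept))
}

fn main() {
    let x = [1.0, 2.0, 3.0, 4.0, 5.0];
    let y = [3.0, 5.0, 7.0, 9.0, 11.0];
    match fit_line(&x, &y) {
        Some((slope, intercept)) => {
            println!("slope = {slope}, intercept = {intercept}");
            let preds: Vec<f64> = x.iter().map(|v| slope * v + intercept).collect();
            println!("Predictions: {:?}", preds);
        }
        None => println!("fit is undefined for this input"),
    }
}
```

The None branch for identical x values is the one-feature analogue of collinear feature columns in the multi-feature case: the data then does not determine a unique set of coefficients.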
4. Performance Optimization and GPU Acceleration
5. Deployment and Delivery