Hardware Optimization
System Kernel & Configuration Tuning
Raise the open-file limit (e.g., ulimit -n 65535) to prevent “Too many open files” errors. To make the setting permanent, edit /etc/security/limits.conf (add * soft nofile 65535 and * hard nofile 65535) and /etc/pam.d/login (add session required pam_limits.so).
Add the following to /etc/sysctl.conf to improve network performance:
# Reuse TIME_WAIT sockets for new outgoing connections
net.ipv4.tcp_tw_reuse = 1
# Increase connection queue length
net.core.somaxconn = 65535
# Expand ephemeral port range
net.ipv4.ip_local_port_range = 1024 65535
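Before applying these settings, it can help to see which of them the running kernel already uses. The following is a minimal sketch, assuming a Linux host where each sysctl key maps to a file under /proc/sys; the names TARGETS, sysctl_path, normalize, and check are illustrative and not part of any Hadoop or system tool.

```python
from pathlib import Path

# Target values copied from the sysctl settings listed above.
TARGETS = {
    "net.ipv4.tcp_tw_reuse": "1",
    "net.core.somaxconn": "65535",
    "net.ipv4.ip_local_port_range": "1024 65535",
}


def sysctl_path(key, root="/proc/sys"):
    """Map a sysctl key such as net.core.somaxconn to its file under root."""
    return Path(root) / key.replace(".", "/")


def normalize(value):
    """Collapse whitespace so a tab-separated range compares equal to a spaced one."""
    return " ".join(value.split())


def check(targets=TARGETS, root="/proc/sys"):
    """Return {key: (current, wanted)} for keys that differ or cannot be read."""
    mismatches = {}
    for key, wanted in targets.items():
        try:
            current = normalize(sysctl_path(key, root).read_text())
        except OSError:
            current = None  # key missing on this kernel, or not readable
        if current != normalize(wanted):
            mismatches[key] = (current, wanted)
    return mismatches


if __name__ == "__main__":
    for key, (current, wanted) in check().items():
        print(f"{key}: current={current!r}, wanted={wanted!r}")
```

Keys that print nothing already match the target values; the script only reads values and never changes them.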
Apply the changes with sysctl -p.
Increase the disk read-ahead size (e.g., via /sys/block/sdX/queue/read_ahead_kb) to enhance sequential read performance for large files.
Add noatime,nodiratime to the mount options in /etc/fstab for HDFS partitions to reduce the file system overhead of tracking access times.
HDFS Configuration Adjustments
Adjust dfs.block.size (in hdfs-site.xml; named dfs.blocksize in Hadoop 2 and later) based on workload.
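One way to reason about the block-size choice is to count how many blocks a file occupies at each candidate size, since every block is an object the NameNode has to track. The sketch below is only arithmetic under that assumption; block_count and compare are hypothetical helper names, and the 64/128/256 MB candidates are illustrative values rather than recommendations from this guide.

```python
MB = 1024 * 1024


def block_count(file_size_bytes, block_size_bytes):
    """Number of HDFS blocks one file occupies (the last block may be partial)."""
    if file_size_bytes == 0:
        return 0
    # Integer ceiling division avoids floating-point rounding on large sizes.
    return -(-file_size_bytes // block_size_bytes)


def compare(file_size_bytes, block_sizes_mb=(64, 128, 256)):
    """Map each candidate block size in MB to the resulting block count."""
    return {mb: block_count(file_size_bytes, mb * MB) for mb in block_sizes_mb}


if __name__ == "__main__":
    ten_gb = 10 * 1024 * MB
    for mb, n in compare(ten_gb).items():
        print(f"{mb} MB blocks: {n} blocks for a 10 GB file")
```

Doubling the block size halves the number of blocks for large files, while files smaller than one block still occupy exactly one block entry each.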
Set dfs.replication (default: 3) based on data criticality: reduce it to 2 for non-critical data to save storage and improve write performance, or increase it to 4 for high-availability workloads.
Set dfs.client.read.shortcircuit to true in hdfs-site.xml so that clients running on the same host as a DataNode can read block files directly from local disk instead of streaming them through the DataNode, reducing read latency. Short-circuit reads also require dfs.domain.socket.path to be configured.
Increase dfs.namenode.handler.count (e.g., 20–50) and dfs.datanode.handler.count (e.g., 30–100) in hdfs-site.xml to handle more concurrent requests, improving throughput for multi-threaded applications.
List multiple directories in dfs.datanode.data.dir (e.g., /data1/dn,/data2/dn) to distribute data storage across disks, reducing I/O bottlenecks.
Data Management Best Practices
Enable compression for data at rest and for data in transit (e.g., mapreduce.map.output.compress=true with SnappyCodec). Snappy is recommended for its low CPU overhead and reasonable compression ratio.
Cluster Scaling & Monitoring
Benchmark the cluster with TestDFSIO (for read/write throughput) and NNBench (for NameNode operations). Analyze the results to identify bottlenecks (e.g., high disk latency, network saturation).