
3.4M Solar Panels

320 points · 273 comments · by marklit · 4w ago

Pangram verdict · v3.3

We believe that this document is fully human-written

AI likelihood (overall): 7%
Verdict: Human (100% human-written, 0% AI-generated)
Segments: 8 of 8 human, 0 of 8 AI
Word count: 1,147
Peak AI likelihood: 13% (§5)
Analyzed: Apr 22 (backend: pangram/v3.3)
Segments scanned: 8 windows, avg 143 words each

Article text · 1,147 words · 8 segments analyzed

§1 Human · 1%

In October, I reviewed the Ground-Mounted Solar Energy in the United States (GM-SEUS) dataset. This dataset attempted to outline the majority of solar farm arrays and panels across the US. Version 1 of this dataset contained 2.9M panels.

On Monday, version 2 of this dataset was released and now contains more than 3.4M panels. In addition to the panels and arrays being refreshed, there is a new rooftop array dataset.

In this post, I'll review v2 of the GM-SEUS dataset.

My Workstation

I'm using a 5.7 GHz AMD Ryzen 9 9950X CPU. It has 16 cores and 32 threads and 1.2 MB of L1, 16 MB of L2 and 64 MB of L3 cache. It has a liquid cooler attached and is housed in a spacious, full-sized Cooler Master HAF 700 computer case.

The system has 96 GB of DDR5 RAM clocked at 4,800 MT/s and a 5th-generation, Crucial T700 4 TB NVMe M.2 SSD which can read at speeds up to 12,400 MB/s. There is a heatsink on the SSD to help keep its temperature down. This is my system's C drive.

The system is powered by a 1,200-watt, fully modular Corsair Power Supply and is sat on an ASRock X870E Nova 90 Motherboard.

I'm running Ubuntu 24 LTS via Microsoft's Ubuntu for Windows on Windows 11 Pro. In case you're wondering why I don't run a Linux-based desktop as my primary work environment, I'm still using an Nvidia GTX 1080 GPU which has better driver support on Windows and ArcGIS Pro only supports Windows natively.

§2 Human · 1%

Installing Prerequisites

I'll use GDAL 3.9.3 to help analyse the data in this post.

    $ sudo add-apt-repository ppa:ubuntugis/ubuntugis-unstable
    $ sudo apt update
    $ sudo apt install \
        gdal-bin

I'll use DuckDB, along with its H3, JSON, Lindel, Parquet and Spatial extensions in this post.

    $ cd ~
    $ wget -c https://github.com/duckdb/duckdb/releases/download/v1.5.1/duckdb_cli-linux-amd64.zip
    $ unzip -j duckdb_cli-linux-amd64.zip
    $ chmod +x duckdb
    $ ~/duckdb

    INSTALL h3 FROM community;
    INSTALL lindel FROM community;
    INSTALL json;
    INSTALL parquet;
    INSTALL spatial;

I'll set up DuckDB to load every installed extension each time it launches.

    $ vi ~/.duckdbrc

    .timer on
    .width 180
    LOAD h3;
    LOAD lindel;
    LOAD json;
    LOAD parquet;
    LOAD spatial;

The maps in this post were rendered with QGIS version 4.0.1. QGIS is a desktop application that runs on Windows, macOS and Linux. The application has grown in popularity in recent years and has ~15M application launches from users all around the world each month.

I used QGIS' HCMGIS plugin to add basemaps from Esri to this post.

Analysis-Ready Datasets

The following will download a 3.4 GB ZIP file. I'll extract any GeoPackage (GPKG) file from it.

    $ wget -O GMSEUS_v2.zip \
        'https://zenodo.org/records/19581821/files/GMSEUS.zip?download=1'
    $ unzip -j GMSEUS_v2.zip "*.gpkg"

The following is the projection used by the GPKG files in this dataset.

    $

§3 Human · 2%

    gdalsrsinfo \
        -o proj4 \
        GMSEUS_RooftopArrays_2025_v2_0.gpkg

    +proj=aea +lat_0=23 +lon_0=-96 +lat_1=29.5 +lat_2=45.5 +x_0=0 +y_0=0 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs

The following will convert the rooftop arrays into Parquet. I needed to use DuckDB v1.4.4 for the following as v1.5.1 was raising exceptions.

    $ ~/duckdb

    COPY (
        WITH a AS (
            SELECT Source,
                   grndCvr,
                   modType,
                   mount,
                   nativeID,
                   roofArrID,
                   area:    IF(area::TEXT='-9999.0',    NULL, area::DOUBLE),
                   azimuth: IF(azimuth::TEXT='-9999.0', NULL, azimuth::DOUBLE),
                   capMWAC: IF(capMWAC::TEXT='-9999.0', NULL, capMWAC::DOUBLE),
                   capMWDC: IF(capMWDC::TEXT='-9999.0', NULL, capMWDC::DOUBLE),
                   tilt:    IF(tilt::TEXT='-9999.0',    NULL, tilt::DOUBLE),
                   instYr:  CASE WHEN instYr::INT = -9999
                                 THEN NULL
                                 ELSE instYr
                                 END,
                   ST_FLIPCOORDINATES(
                       ST_TRANSFORM(
                           geom,
                           '+proj=aea +lat_0=23 +lon_0=-96

§4 Human · 4%

                           +lat_1=29.5 +lat_2=45.5 +x_0=0 +y_0=0 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs',
                           'EPSG:4326')) geometry
            FROM ST_READ('GMSEUS_RooftopArrays_2025_v2_0.gpkg')
        )
        SELECT * EXCLUDE (geometry),
               {'xmin': ST_XMIN(ST_EXTENT(geometry)),
                'ymin': ST_YMIN(ST_EXTENT(geometry)),
                'xmax': ST_XMAX(ST_EXTENT(geometry)),
                'ymax': ST_YMAX(ST_EXTENT(geometry))} AS bbox,
               ST_ASWKB(geometry) geometry
        FROM a
        ORDER BY HILBERT_ENCODE([ST_Y(ST_CENTROID(geometry)),
                                 ST_X(ST_CENTROID(geometry))]::double[2])
    ) TO 'GMSEUS_RooftopArrays_2025_v2_0.parquet' (
        FORMAT 'PARQUET',
        CODEC 'ZSTD',
        COMPRESSION_LEVEL 22,
        ROW_GROUP_SIZE 15000);

There are 5,822 records in this dataset.

    SELECT COUNT(*)
    FROM 'GMSEUS_RooftopArrays_2025_v2_0.parquet';

    5,822

Below is a breakdown of unique values and NULL coverage across each column.

    SELECT column_name,
           column_type,
           null_percentage,
           approx_unique,
           min,
           max
    FROM (SUMMARIZE
          FROM READ_PARQUET('GMSEUS_RooftopArrays_2025_v2_0.parquet'))
    WHERE column_name != 'geometry'
    AND   column_name != 'bbox'
    ORDER BY LOWER(column_name);

    ┌─────────────┬─────────────┬─────────────────┬───────────────┬────────────┬────────────────────┐
    │ column_name │ column_type │ null_percentage │ approx_unique │    min     │

§5 Human · 13%

            max         │
    │   varchar   │   varchar   │  decimal(9,2)   │     int64     │  varchar   │      varchar       │
    ├─────────────┼─────────────┼─────────────────┼───────────────┼────────────┼────────────────────┤
    │ area        │ DOUBLE      │            2.77 │          5180 │ 15.0       │ 487111.0           │
    │ azimuth     │ DOUBLE      │           89.63 │           156 │ 0.0        │ 530.02323408881    │
    │ capMWAC     │ DOUBLE      │           89.52 │            60 │ 0.2        │ 74.9               │
    │ capMWDC     │ DOUBLE      │           87.12 │           166 │ 0.00448    │ 99.7               │
    │ grndCvr     │ VARCHAR     │           97.61 │             2 │ impervious │ vegetation         │
    │ instYr      │ BIGINT      │           72.43 │            23 │ 2003       │ 2025               │
    │ modType     │ VARCHAR     │            0.00 │             2 │ c-si       │ thin-film          │
    │ mount       │ VARCHAR     │           87.53 │             5 │ dual_axis  │ unknown            │
    │ nativeID    │ VARCHAR     │            0.00 │          4540 │ 1          │ Xebec 1 solar farm │
    │ roofArrID   │ BIGINT      │            0.00 │          5830 │ 1          │ 5822               │
    │ Source      │ VARCHAR     │            0.00 │

§6 Human · 3%

                15 │ CCVPV      │ gspt               │
    │ tilt        │ DOUBLE      │           90.64 │            31 │ 0.0        │ 52.0               │
    ├─────────────┴─────────────┴─────────────────┴───────────────┴────────────┴────────────────────┤
    │ 12 rows                                                                             6 columns │
    └───────────────────────────────────────────────────────────────────────────────────────────────┘

The following will convert the panels into Parquet.

    COPY (
        WITH a AS (
            SELECT Source,
                   arrayID:  arrayID::INT,
                   panelID:  panelID::INT,
                   pnlSource,
                   rowArea,
                   rowAzimuth,
                   rowLength,
                   rowMount,
                   rowSpace: IF(rowSpace::TEXT='-9999.0', NULL, rowSpace::DOUBLE),
                   rowWidth,
                   ST_FLIPCOORDINATES(
                       ST_TRANSFORM(
                           geom,
                           '+proj=aea +lat_0=23 +lon_0=-96 +lat_1=29.5 +lat_2=45.5 +x_0=0 +y_0=0 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs',
                           'EPSG:4326')) geometry
            FROM ST_READ('GMSEUS_Panels_Final_2025_v2_0.gpkg')
        )
        SELECT * EXCLUDE (geometry),
               {'xmin': ST_XMIN(ST_EXTENT(geometry)),
                'ymin': ST_YMIN(ST_EXTENT(geometry)),
                'xmax': ST_XMAX(ST_EXTENT(geometry)),
                'ymax': ST_YMAX(ST_EXTENT(geometry))} AS bbox,
               ST_ASWKB(geometry) geometry

§7 Human · 3%

        FROM a
        ORDER BY HILBERT_ENCODE([ST_Y(ST_CENTROID(geometry)),
                                 ST_X(ST_CENTROID(geometry))]::double[2])
    ) TO 'GMSEUS_Panels_Final_2025_v2_0.parquet' (
        FORMAT 'PARQUET',
        CODEC 'ZSTD',
        COMPRESSION_LEVEL 22,
        ROW_GROUP_SIZE 15000);

There are 3,429,157 records in this dataset.

    SELECT COUNT(*)
    FROM 'GMSEUS_Panels_Final_2025_v2_0.parquet';

    3,429,157

Below is a breakdown of unique values and NULL coverage across each column.

    SELECT column_name,
           column_type,
           null_percentage,
           approx_unique,
           min,
           max
    FROM (SUMMARIZE
          FROM READ_PARQUET('GMSEUS_Panels_Final_2025_v2_0.parquet'))
    WHERE column_name != 'geometry'
    AND   column_name != 'bbox'
    ORDER BY LOWER(column_name);

    ┌─────────────┬─────────────┬─────────────────┬───────────────┬───────────┬─────────────┐
    │ column_name │ column_type │ null_percentage │ approx_unique │    min    │     max     │
    │   varchar   │   varchar   │  decimal(9,2)   │     int64     │  varchar  │   varchar   │
    ├─────────────┼─────────────┼─────────────────┼───────────────┼───────────┼─────────────┤
    │ arrayID     │ INTEGER     │            0.03 │         12653 │ 1         │ 18980       │
    │ panelID     │ INTEGER     │            0.00 │       3323765 │ 1         │ 3429157     │
    │ pnlSource   │ VARCHAR

§8 Human · 6%

                                │            0.00 │             5 │ CCVPV     │ OSM         │
    │ rowArea     │ DOUBLE      │            0.00 │        100105 │ 15.01     │ 9982.68     │
    │ rowAzimuth  │ DOUBLE      │            0.00 │         22029 │ 90.0      │ 540.0       │
    │ rowLength   │ DOUBLE      │            0.00 │         25531 │ 3.96      │ 737.38      │
    │ rowMount    │ VARCHAR     │            0.00 │             3 │ dual_axis │ single_axis │
    │ rowSpace    │ DOUBLE      │            1.27 │          1836 │ 0.01      │ 20.0        │
    │ rowWidth    │ DOUBLE      │            0.00 │          2258 │ 0.45      │ 135.33      │
    │ Source      │ VARCHAR     │            0.00 │            12 │ CCVPV     │ USPVDB      │
    ├─────────────┴─────────────┴─────────────────┴───────────────┴───────────┴─────────────┤
    │ 10 rows                                                                     6 columns │
    └───────────────────────────────────────────────────────────────────────────────────────┘

The following will convert the arrays and panels datasets into Parquet.

    COPY (
        WITH a AS (
            SELECT COUNTYFP,
                   GCR1,
                   GCR2,
                   STATEFP,
                   Source,
                   arrayID,
                   avgAzimuth: IF(avgAzimuth::TEXT='-9999.0', NULL, avgAzimuth::DOUBLE),
                   avgLength: