
Getting Started With SpaceNet Data

The first SpaceNet challenge is complete, but the data remains available for download and analysis on AWS. This dataset contains a massive amount of labeled data in GeoJSON files, a format that may be unfamiliar to many in the computer vision field. This post aims to lower the barrier to entry for exploring SpaceNet data by demonstrating methods to transform and visualize the raw SpaceNet GeoJSON labels into formats more conducive to machine learning, namely NumPy arrays and image masks. Further motivating the study of SpaceNet data is the release of a new SpaceNet point of interest dataset. We include Python code for the interested reader, and refer the reader to the SpaceNet Challenge repository for more utilities.

  • December 2017 update: updated code is also available here.

1. Data Access

After creating an AWS account, download the data at the SpaceNet AWS portal. Detailed descriptions of data formats and download instructions can be found here. In short, the command to download processed 200m x 200m image tiles with associated building footprints is:

aws s3api get-object --bucket spacenet-dataset \
--key AOI_1_Rio/processedData/processedBuildingLabels.tar.gz \
--request-payer requester processedBuildingLabels.tar.gz

For this post, we will focus on the TopCoder challenge dataset. Upon downloading and expanding the tarballs, the TopCoder training directory structure should appear as follows:

Figure 1. SpaceNet TopCoder data directory

In this post we will focus on the high-resolution 3-band imagery as well as the vector data.

2. Data Inspection

Image cutouts for the pan-sharpened 3-band imagery are 438–439 pixels in width and 406–407 pixels in height. The 8-band images have not been pan-sharpened and so have one quarter the linear resolution of the 3-band imagery, at roughly 110 x 102 pixels. For each unique image ID we find a corresponding entry in the vectordata/geoJson directory with building footprints.

Figure 2. Random image from the SpaceNet training dataset (3band_013022223130_Public_img124.tif).
Figure 3. First entry of the GeoJSON label file associated with Figure 2, showing the first building label for the image. Coordinates are stored as a WKT polygon or multipolygon, with each vertex given as [longitude, latitude, elevation]. The elevation field is always zero for this dataset.

2. Ground Truth Transform

Computer vision algorithms tend to operate in pixel space, where locations are reported on the matrix of pixel positions rather than latitude and longitude. After the initial data download, or extraction, the second step in the extract-transform-load (ETL) process is to transform the latitude-longitude coordinates in the GeoJSON label files to pixel coordinates. We describe three methods of transforming the GeoJSON label files into pixel coordinates in various formats.

2.1 Building Outline Coordinates

The GeoJSON file lists building polygon vertices in latitude and longitude. Transforming these vertices into pixel coordinates requires knowledge of the image extent and the precise geometric coordinate transform of the image. This information (along with much more) can be extracted with the GDAL code suite. A number of sophisticated functions using GDAL and other geospatial libraries are available in the SpaceNet utilities repository on GitHub. The code below takes the GeoJSON label file and corresponding image and returns two coordinate arrays, one in geospatial coordinates (latitude and longitude) and one in pixel coordinates.

Code snippet 1. Function to transform GeoJSON label files to an array of coordinates (both lat,lon and pixel).
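
A minimal sketch of such a transform is given here, under some stated assumptions. It is not the function from the SpaceNet utilities repository. The names load_geojson, latlon_to_pixel and geojson_to_pixel_arr are hypothetical, the label file is assumed to follow the standard GeoJSON FeatureCollection layout, and the image is assumed to be georeferenced directly in longitude and latitude so that no reprojection is needed. The six-element geotransform tuple is the one GDAL returns from GetGeoTransform() on an opened image.

```python
import json


def load_geojson(path):
    """Read a GeoJSON label file into a Python dict."""
    with open(path) as f:
        return json.load(f)


def latlon_to_pixel(lon, lat, geotransform):
    """Invert a GDAL-style affine geotransform to get (col, row).

    The forward transform is:
        x = x0 + col * px_w  + row * rot_x
        y = y0 + col * rot_y + row * px_h
    so the pixel position is the solution of that 2x2 linear system.
    """
    x0, px_w, rot_x, y0, rot_y, px_h = geotransform
    det = px_w * px_h - rot_x * rot_y
    dx, dy = lon - x0, lat - y0
    col = (px_h * dx - rot_x * dy) / det
    row = (px_w * dy - rot_y * dx) / det
    return col, row


def geojson_to_pixel_arr(geojson, geotransform):
    """Return (latlon_polys, pixel_polys), one vertex list per building.

    latlon_polys holds (lat, lon) tuples; pixel_polys holds (col, row)
    tuples. Assumption: vertices are stored as [lon, lat, elevation].
    """
    latlon_polys, pixel_polys = [], []
    for feature in geojson.get("features", []):
        geom = feature.get("geometry") or {}
        if geom.get("type") == "Polygon":
            polygons = [geom["coordinates"]]
        elif geom.get("type") == "MultiPolygon":
            polygons = geom["coordinates"]
        else:
            continue  # skip points, lines, and empty geometries
        for rings in polygons:
            exterior = rings[0]  # outer boundary only; interior holes are ignored
            latlon_polys.append([(pt[1], pt[0]) for pt in exterior])
            pixel_polys.append(
                [latlon_to_pixel(pt[0], pt[1], geotransform) for pt in exterior]
            )
    return latlon_polys, pixel_polys
```

With the GDAL Python bindings installed, the geotransform for an image can be obtained with gdal.Open(image_path).GetGeoTransform() and passed in together with load_geojson(label_path). Keeping the GDAL call outside the function means the coordinate arithmetic can be checked on its own.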

We can inspect our transform by overlaying the ground truth polygons on the input image using matplotlib.

Code snippet 2. Function to plot the truth coordinates for an input image.
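
One way such an overlay could be drawn is sketched here; it is an illustration, not the original snippet. The function name plot_truth_coords and its parameters are hypothetical. The image is assumed to be already loaded as an array (rows x columns, or rows x columns x 3), and each polygon is assumed to be a list of (column, row) pixel vertices. The Agg backend is selected so the figure can be written to a file without a display.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; remove to use plt.show()
import matplotlib.pyplot as plt
from matplotlib.patches import Polygon


def plot_truth_coords(image, pixel_polys, out_path=None):
    """Overlay building outlines (pixel coordinates) on an image.

    image       : array-like accepted by imshow
    pixel_polys : list of polygons, each a list of (col, row) vertices
    out_path    : optional file name; the figure is saved there if given
    """
    fig, ax = plt.subplots(figsize=(8, 8))
    # imshow puts the origin at the upper left, so rows increase downward,
    # matching the pixel coordinate convention of the polygons.
    ax.imshow(image)
    for poly in pixel_polys:
        ax.add_patch(
            Polygon(poly, closed=True, fill=False, edgecolor="yellow", linewidth=1.5)
        )
    ax.set_title("Ground truth building outlines")
    ax.set_axis_off()
    if out_path is not None:
        fig.savefig(out_path, dpi=150, bbox_inches="tight")
    return fig, ax
```

Drawing unfilled outlines rather than filled patches keeps the underlying imagery visible, which makes it easier to spot a misaligned transform.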
