Big Data's Biggest Challenge Comes from Climate Change

Global sea levels are about eight inches higher today than they were in 1880, and they are expected to rise another two to seven feet during this century. At the same time, some 5 million people in the U.S. live in 2.6 million coastal homes situated less than 4 feet above high tide.

Do the math: Climate change is a problem, whatever its cause.

The problem? Actually making those complex calculations is an extremely challenging proposition. To understand the impact of climate change at the local level, you’ll need more than back-of-the-napkin mathematics.

You’ll need big data technology.

Surging Seas is an interactive map and tool developed by the nonprofit Climate Central that shows in graphic detail the threats from sea-level rise and storm surges to all of the 3,000-plus coastal towns, cities, counties and states in the continental United States. With detail down to neighborhood scale—search for a specific location or zoom down as necessary—the tool matches areas with flooding risk timelines and provides links to fact sheets, data downloads, action plans, embeddable widgets, and other items.

It’s the kind of number-crunching that was all but impossible only a few years ago.

‘Just as powerful, just as big’

“Our strategy is to tell people about their climate locally in ways they can understand, and the only way to do that is with big data analysis,” said Richard Wiles, vice president for strategic communications and director of research with Climate Central. “Big data allows you to say simple, clear things.”

There are actually two types of big data in use today to help understand and deal with climate change, Wiles said. The first is relatively recently collected data so voluminous and complex that it couldn’t be effectively manipulated before, such as NASA images of heat over cities. This kind of data “literally was too big to handle not that long ago,” he said, “but now you can handle it on a regular computer.”

The second type of big data is older datasets that may be less-than-reliable. This data “was always kind of there,” Wiles said, such as historic temperature trends in the United States. That kind of dataset is not overly complex, but it can be fraught with gaps and errors. “A guy in Oklahoma may have broken his thermometer back in 1936,” Wiles said, meaning that there could be no measurements at all for two months of that year.

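Repairing such gaps is a routine preprocessing step before any trend analysis. The sketch below is purely illustrative (not Climate Central's actual pipeline, and the temperatures are made up): it linearly interpolates missing monthly readings, like the two months lost to that broken thermometer.

```python
def fill_gaps(monthly_temps):
    """Linearly interpolate None entries in a list of monthly mean temps.

    Assumes the first and last entries are present.
    """
    temps = list(monthly_temps)
    i = 0
    while i < len(temps):
        if temps[i] is None:
            # Find the end of this run of missing months.
            j = i
            while temps[j] is None:
                j += 1
            # Interpolate between the last and next known readings.
            lo, hi = temps[i - 1], temps[j]
            step = (hi - lo) / (j - i + 1)
            for k in range(i, j):
                temps[k] = lo + step * (k - i + 1)
            i = j
        i += 1
    return temps

# A 1936-style record with July and August missing (hypothetical values, deg F):
record = [38.1, 44.0, 55.2, 63.7, 72.5, 81.0, None, None, 75.3, 64.8, 50.1, 40.2]
print(fill_gaps(record))
```

Linear interpolation is the crudest reasonable choice; real station-data cleaning also cross-checks against neighboring stations before filling anything in.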
Address those issues, and existing data can be “just as powerful, just as big,” Wiles said. “It makes it possible to make the story very local.”

Climate Central imports data from historical government records to produce highly localized graphics for about 150 local TV weather forecasters across the U.S., illustrating climate change in each station’s particular area. For example, “Junes in Toledo are getting hotter,” Wiles said. “We use these data all the time to try to localize the climate change story so people can understand it.”

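A claim like "Junes in Toledo are getting hotter" boils down to fitting a trend line to one month's readings across many years. A minimal sketch of that calculation, with fabricated numbers rather than Climate Central's data:

```python
def trend_per_decade(years, temps):
    """Ordinary least-squares slope of temperature vs. year, in degrees per decade."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(temps) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, temps))
    var = sum((x - mean_x) ** 2 for x in years)
    return 10 * cov / var  # slope per year, times 10 for per-decade

# Hypothetical mean June temperatures (deg F), every 5 years from 1970 to 2010:
years = list(range(1970, 2015, 5))
junes = [71.2, 71.0, 71.9, 72.4, 72.1, 73.0, 73.3, 73.8, 74.1]
print(f"June warming: {trend_per_decade(years, junes):+.2f} degF/decade")
```

A positive slope is the whole "getting hotter" story, which is exactly why the graphic works on local TV.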
‘One million hours of computation’

Though the Climate Central map is an effective tool for illustrating the problem of rising sea levels, big data technology is also helping researchers model, analyze, and predict the effects of climate change.

“Our goal is to turbo-charge the best science on massive data to create novel insights and drive action,” said Rebecca Moore, engineering manager for Google Earth Engine. Google Earth Engine aims to bring together the world’s satellite imagery—trillions of scientific measurements dating back almost 40 years—and make it available online along with tools for researchers.

Global deforestation, for example, “is a significant contributor to climate change, and until recently you could not find a detailed current map of the state of the world’s forests anywhere,” Moore said. That changed last November when Science magazine published the first high-resolution maps of global forest change from 2000 to 2012, powered by Google Earth Engine.

“We ran forest-mapping algorithms developed by Professor Matt Hansen of University of Maryland on almost 700,000 Landsat satellite images—a total of 20 trillion pixels,” she said. “It required more than one million hours of computation, but because we ran the analysis on 10,000 computers in parallel, Earth Engine was able to produce the results in a matter of days.”

On a single computer, that analysis would have taken more than 15 years. Anyone in the world can view the resulting interactive global map on a PC or mobile device.

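What makes that cluster speed-up possible is that each satellite tile can be classified independently: the job is an embarrassingly parallel map. Here is a toy sketch of the pattern, with a thread pool and a made-up threshold "classifier" standing in for Earth Engine's 10,000 machines and Hansen's actual algorithm:

```python
from concurrent.futures import ThreadPoolExecutor

def classify_tile(tile):
    """Stand-in for a per-tile forest-mapping algorithm:
    count 'forest' pixels (here, vegetation-index values above a threshold)."""
    return sum(1 for pixel in tile if pixel > 0.5)

# Fake tiles: lists of per-pixel vegetation-index values in [0, 1].
tiles = [[(i * j) % 100 / 100 for j in range(1000)] for i in range(1, 201)]

# Sequential and parallel runs produce identical results;
# the pool just spreads the per-tile work across workers.
sequential = [classify_tile(t) for t in tiles]
with ThreadPoolExecutor(max_workers=8) as pool:
    parallel = list(pool.map(classify_tile, tiles))

assert parallel == sequential
print(f"forest pixels found: {sum(parallel)}")
```

Because no tile depends on any other, the wall-clock time shrinks roughly in proportion to the number of workers, which is the arithmetic behind "a million hours in a matter of days."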
‘We have sensors everywhere’

Rapidly propelling such developments, meanwhile, is the fact that data is being collected today on a larger scale than ever before.

“Big data in climate first means that we have sensors everywhere: in space, looking down via remote sensing satellites, and on the ground,” said Kirk Borne, a data scientist and professor at George Mason University. Those sensors are continually recording information about weather, land use, vegetation, oceans, ice cover, precipitation, drought, water quality, and many more variables, he said. They are also tracking correlations between datasets: biodiversity changes, invasive species, and at-risk species, for example.

Two large monitoring projects of this kind are NEON—the National Ecological Observatory Network—and OOI, the Ocean Observatories Initiative.

“All of these sensors also deliver a vast increase in the rate and the number of climate-related parameters that we are now measuring, monitoring, and tracking,” Borne said. “These data give us increasingly deeper and broader coverage of climate change, both temporally and geospatially.”

Climate change drives some of the largest scientific modeling and simulation efforts, Borne said. Those efforts focus not on tomorrow’s weather but on decades and centuries into the future.

“Huge climate simulations are now run daily, if not more frequently,” he said. These simulations have increasingly higher horizontal spatial resolution—tens of kilometers, versus hundreds of kilometers in older simulations; higher vertical resolution, referring to the number of atmospheric layers that can be modeled; and higher temporal resolution—zeroing in on minutes or hours as opposed to days or weeks, he added.

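The storage consequences of those resolution gains follow from simple arithmetic: cutting the horizontal grid spacing by a factor of n multiplies the number of cells by n squared, and extra vertical layers and finer time steps multiply the output further. A back-of-the-envelope sketch with illustrative numbers, not any particular model's configuration:

```python
EARTH_SURFACE_KM2 = 510_000_000  # approximate surface area of Earth

def cells(horizontal_km, layers):
    """Approximate number of grid cells at a given horizontal spacing."""
    return int(EARTH_SURFACE_KM2 / horizontal_km**2) * layers

old = cells(horizontal_km=200, layers=20)  # coarse: hundreds of km, few layers
new = cells(horizontal_km=25, layers=60)   # fine: tens of km, more layers

print(f"old grid: {old:,} cells")
print(f"new grid: {new:,} cells")
print(f"growth factor: {new / old:.0f}x")
```

And that factor applies per variable, per time step; finer time steps multiply it again, which is how daily simulation output reaches petabytes.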
The output of each daily simulation amounts to petabytes of data and requires an assortment of tools for storing, processing, analyzing, visualizing, and mining.

‘All models are wrong, but some are useful’

Interpreting climate change data may be the most challenging part.

“When working with big data, it is easy to create a model that explains the correlations that we discover in our data,” Borne said. “But we need to remember that correlation does not imply causation, and so we need to apply systematic scientific methodology.”

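The trap Borne describes is easy to reproduce: any two quantities that both trend over time correlate strongly even when neither influences the other. A small demonstration with fabricated series:

```python
def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Two unrelated quantities that both happen to rise year over year
# (invented numbers for 15 consecutive years):
global_temp_anomaly = [0.40 + 0.02 * i + (0.03 if i % 3 else -0.03) for i in range(15)]
internet_users_bn = [0.4 + 0.15 * i for i in range(15)]

r = pearson(global_temp_anomaly, internet_users_bn)
print(f"r = {r:.2f}")  # strongly correlated, yet neither causes the other
```

A shared upward trend is all it takes for r to approach 1, which is why a model that merely "explains the correlations" proves nothing about mechanism.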
It’s also important to heed the maxim that “all models are wrong, but some are useful,” Borne said, quoting statistician George Box. “This is especially critical for numerical computer simulations, where there are so many assumptions and ‘parameterizations of our ignorance.’

“What fixes that problem—and also addresses Box’s warning—is data assimilation,” Borne said, referring to the process by which “we incorporate the latest and greatest observational data into the current model of a real system in order to correct, adjust, and validate. Big data play a vital and essential role in climate prediction science by providing corrective actions through ongoing data assimilation.”

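In its simplest scalar form, the assimilation step Borne describes is a weighted blend of the model forecast and the newest observation, with the weights set by how uncertain each one is. The sketch below is a generic Kalman-style update, not any specific operational scheme:

```python
def assimilate(forecast, obs, forecast_var, obs_var):
    """Blend a model forecast with an observation.

    The gain weights the observation by the relative uncertainty of the
    two sources, so the analysis always lands between forecast and
    observation, and its variance shrinks below the forecast's.
    """
    gain = forecast_var / (forecast_var + obs_var)
    analysis = forecast + gain * (obs - forecast)
    analysis_var = (1 - gain) * forecast_var
    return analysis, analysis_var

# Model says 14.0 degC; a trusted sensor reads 15.0 degC.
state, var = assimilate(forecast=14.0, obs=15.0, forecast_var=0.8, obs_var=0.2)
print(f"analysis: {state:.2f} degC (variance {var:.2f})")
```

Because the sensor here is four times more certain than the model, the analysis ends up much closer to the observation; repeating this update as new data arrives is the "ongoing data assimilation" that keeps a simulation honest.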
‘We are in a data revolution’

Earlier this year, the Obama administration launched an official climate-data website with more than 100 curated, high-quality data sets, Web services, and tools that can be used by anyone to help prepare for the effects of climate change. At the same time, NASA invited citizens to help find solutions to coastal flooding at an April mass-collaboration event.

More recently, UN Global Pulse launched a Big Data Climate Challenge to crowdsource projects that use big data to address the economic dimensions of climate change.

“We’ve already received submissions from 20 countries in energy, smart cities, forestry and agriculture,” said Miguel Luengo-Oroz, chief scientist for Global Pulse, which focuses on relief and development efforts around the world. “We also hope to see submissions from fields such as architecture, green data centers, risk management and material sciences.”

Big data can allow for more efficient responses to emerging crises, distributed access to knowledge, and greater understanding of the effects personal and policy decisions have on the planet’s climate, Luengo-Oroz added.

“But it’s not the data that will save us,” he said. “It’s the analysis and usage of the data that can help us make better decisions for climate action. Just like with climate change, it is no longer a question of, ‘is this happening?’ We are in a data revolution.”
