How Will Astronomy Archives Survive the Data Tsunami?

Submitted By hottied2
The astronomy of the cosmos is the last frontier, and the data that supports our incessant desire for knowledge about the universe is causing a surge in data storage, replication, and the need for still more capacity. How will this community of scientists and engineers decide what to keep, what not to keep, and how to keep it, given the vast array of data our use requires?

An Astronomical Surge of Data
Degraded performance should not be the accepted outcome of saving such data. One cannot simply assume that adding infrastructure as usage increases, as is common in commercial enterprises, will solve the problem. Because astronomy archives generally operate on limited budgets that are fixed for several years, any change in computer architecture has to be foreseen and budgeted years in advance. Moreover, how do you plan for new discoveries? The current data-access and computing model used in astronomy will not keep up with the rate at which data is being collected. Currently, data is downloaded from archives to a local machine to be analyzed, and this is done on a very large scale so that the data is accessible to end users. In the future, data discovery, access, and processing are likely to be distributed across several archives, because the maximum science return will come from a “federation of data”: multiple archives that together cover a broad wavelength range.
Astronomy data is collected and archived in petabytes (PB). A petabyte is roughly a thousand terabytes (1,024 in binary units), or in simpler terms about a million billion bytes. Over time, astronomers have collected data on the stars and their alignments, planets (that have come and gone), and terrestrial beginnings. The common thread is that historical data must be maintained, and this is the most important factor in archival capability for astronomy as a whole. Astronomers are often looking for the next big thing, but they must be able to compare it to “what is” or “what was” in order to recognize it. Subtle changes in astronomy sometimes occur over centuries, not decades. For example, Halley’s Comet is a periodic comet that becomes visible from Earth roughly every 75 years; without a historical record of earlier sightings, we would not have been able to track it or know that it recurs.
With astronomy, the more years of comparable data you have, the more accurate your predictions become, no matter what the focus of the prediction. This is the way astronomers have done business. But the more data that is captured, kept, and analyzed, the more powerful the processors needed to make use of it.

Let’s Talk Memory and Storage
Imagine the impact of archiving and processing a PB of data. The astronomical community has been doing just that. The 2010 Decadal Survey of Astronomy and Astrophysics, commissioned by the National Academy of Sciences, set out a list of priorities for the astronomy community over the following 10 years, and the community has had to recognize important infrastructure shortfalls and its inability to effectively manage the amount and level of data being collected and archived.
The storage industry raised the same alarm at the Flash Memory Summit, where it was noted that the amount of data is rapidly outgrowing the infrastructure currently in place. From storage to processors, we can’t seem to keep up. Hard disk drives are no match for PB of data. Archiving the data in data warehouses and the cloud may be a possibility, but this amount of data would surely break the bank of a cloud storage or storage-as-a-service system. Gary Gentry, Seagate’s Senior Vice President, stated, “You can take that as a crisis; we see it as an opportunity.” Gentry was referring to the forecast that data storage requirements in 2020 would total around six (6) zettabytes. Just another data note: a zettabyte is a thousand exabytes, an exabyte is a thousand petabytes, and a petabyte is a thousand terabytes. Get it? Think about it in terms of movies, medical artifacts, big-business and corporate data, and automotive, banking, and security-related information, not to mention intelligence data and military files. Now can you begin to fathom how much data is out there? Add the astronomical data on top of that and you begin to understand the scale of what we need to store and access.
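The unit ladder above is easy to lose track of, so here is a minimal sketch of the arithmetic. It assumes decimal (SI) units, where each step is a factor of 1,000; the function names are made up for illustration, and the binary petabyte of 1,024 terabytes is shown only for comparison.

```python
# Decimal (SI) storage units, each a factor of 1,000 larger than the last.
SI_UNITS = ["B", "kB", "MB", "GB", "TB", "PB", "EB", "ZB"]


def si_bytes(value, unit):
    """Return the number of bytes in `value` of the given SI unit."""
    return value * 1000 ** SI_UNITS.index(unit)


def convert(value, from_unit, to_unit):
    """Convert a quantity from one SI unit to another."""
    return si_bytes(value, from_unit) / si_bytes(1, to_unit)


# Binary petabyte for comparison: 1,024 terabytes of 1,024**4 bytes each.
BINARY_PETABYTE = 1024 ** 5

if __name__ == "__main__":
    print(convert(1, "ZB", "PB"))                # petabytes in one zettabyte
    print(convert(6, "ZB", "TB"))                # terabytes in the 6 ZB forecast
    print(BINARY_PETABYTE / si_bytes(1, "PB"))   # binary PB relative to SI PB
```

In these units one zettabyte is a million petabytes, so the six-zettabyte forecast corresponds to six billion terabytes.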

Future Data Storage and Processing
National cyberinfrastructure programs must partner with the businesses that use and develop system resources in order to come up with a working solution. One industry can’t do it alone. There is too much to consider to risk missing something important. An investigation of emergent technologies and solutions must be at the forefront. As usual, money is front and center of the problem rather than the solution.
Data solutions cost money. Although we get more storage for the money today, the cost of researching and implementing new technologies is astronomical, to say the least (no pun intended). We are continually limited by our budgets, especially for a solution that is still being vetted and has not yet proven its worth. The question is: how do you put a price on storing such important and necessary information? How can you budget for a new discovery?
Two newer technologies being considered are graphics processing units (GPUs) and cloud computing. A GPU is a specialized circuit designed to accelerate the creation of images in a frame buffer intended for output to a display. Because GPUs work on many data elements in parallel, they manipulate computer graphics so efficiently that they can be more effective than a standard central processing unit (CPU) for calculations and algorithms that process large blocks of data. GPUs pick up where CPUs leave off. As more graphics-intensive and data-intensive applications are developed, the heavy lifting of mathematical algorithms is increasingly being handed from the CPU to the GPU.
Brian Schmidt, an astronomer at the Research School of Astronomy and Astrophysics at the Australian National University, has worked extensively with “big data” in the field. He noted that new telescopes like SkyMapper are “creating massive amounts of data, a terabyte of data each night” and called for “interdisciplinary groups of researchers to work together to meet these challenges”. He also observed: “Having digital access to research and reference material through services like the Astrophysics Data System, there are substantial information challenges we are facing that I think libraries, archives and museums could help with.” (2013)
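To put the quoted figure of a terabyte each night in perspective, the short sketch below estimates how long a single telescope would take to fill a petabyte. It rests on loud assumptions that are not in the interview: decimal units, a constant nightly rate, and observing on all 365 nights of the year. The function names are illustrative only.

```python
# Assumption: decimal units, so 1 PB = 1,000 TB (binary units would give 1,024).
TB_PER_PB = 1000


def nights_to_fill(capacity_pb, tb_per_night=1.0):
    """Nights needed to accumulate `capacity_pb` petabytes at a constant rate."""
    return capacity_pb * TB_PER_PB / tb_per_night


def years_to_fill(capacity_pb, tb_per_night=1.0, nights_per_year=365):
    """Same estimate in years; assumes observing every night (hypothetical)."""
    return nights_to_fill(capacity_pb, tb_per_night) / nights_per_year


if __name__ == "__main__":
    print(nights_to_fill(1))            # nights to reach one petabyte
    print(round(years_to_fill(1), 2))   # the same span expressed in years
```

Under those assumptions one instrument reaches a petabyte in 1,000 nights, a little under three years, and an archive that federates several such instruments gets there proportionally faster.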

Discovering Data Innovations
Working out how to manage the anticipated growth without overloading the system is one of the many technological challenges we face. The VAO (Virtual Astronomical Observatory) is developing an indexing scheme that supports fast, scalable access to massive databases of astronomical data sets. The scheme uses R-trees, which index multidimensional information across archived data warehouses and thereby speed up access time. Custom solutions such as these may be more expensive to implement, but they are definitely required.
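The idea behind an R-tree can be sketched in a few lines. What follows is not the VAO's implementation. It is a simplified, statically packed variant, built once from points sorted by their first coordinate, with no insertion or node splitting and no handling of spherical sky coordinates. All names are hypothetical. It is meant only to show how nested bounding boxes let a search skip most of the data.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    box: tuple                                     # (xmin, ymin, xmax, ymax) covering everything below
    children: list = field(default_factory=list)   # child nodes (internal node)
    items: list = field(default_factory=list)      # (x, y, label) points (leaf node)


def bounding_box(boxes):
    """Smallest box that encloses all the given boxes."""
    xmins, ymins, xmaxs, ymaxs = zip(*boxes)
    return (min(xmins), min(ymins), max(xmaxs), max(ymaxs))


def build(points, capacity=4):
    """Bulk-load a packed tree from (x, y, label) points."""
    if not points:
        return Node(box=(0.0, 0.0, 0.0, 0.0))
    # Sort so that neighbouring points tend to land in the same leaf.
    pts = sorted(points, key=lambda p: (p[0], p[1]))
    level = []
    for i in range(0, len(pts), capacity):
        chunk = pts[i:i + capacity]
        box = bounding_box([(x, y, x, y) for x, y, _ in chunk])
        level.append(Node(box=box, items=chunk))
    # Group nodes into parents until a single root remains.
    while len(level) > 1:
        parents = []
        for i in range(0, len(level), capacity):
            group = level[i:i + capacity]
            parents.append(Node(box=bounding_box([n.box for n in group]), children=group))
        level = parents
    return level[0]


def overlaps(a, b):
    """True if two (xmin, ymin, xmax, ymax) boxes intersect."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def search(node, query):
    """Return labels of points inside `query`, pruning subtrees whose box misses it."""
    if not overlaps(node.box, query):
        return []
    hits = [label for x, y, label in node.items
            if query[0] <= x <= query[2] and query[1] <= y <= query[3]]
    for child in node.children:
        hits.extend(search(child, query))
    return hits


if __name__ == "__main__":
    # Hypothetical catalogue: a 10 x 10 grid of labelled positions.
    catalogue = [(float(x), float(y), f"src_{x}_{y}") for x in range(10) for y in range(10)]
    tree = build(catalogue)
    print(sorted(search(tree, (2, 3, 4, 5))))
```

The saving comes from the overlap test at the top of the search: any branch whose bounding box misses the query region is discarded without looking at the points inside it.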

Commercial entities that take a genuine interest in finding and funding such solutions also help to keep costs down. Commercial cloud solutions are best suited to memory-intensive applications, of which astronomy definitely is one. The ability to process large quantities of image data still has to be weighed against the cost of transferring that data so that it is available when needed or for ad hoc use. Currently, renting mass storage space on the Amazon cloud is more expensive than purchasing it outright.
The design of cloud computing and storage as a service offers a cheaper and much greener alternative to earlier designs. Using GPUs rather than CPUs, in the form of a centrally based GPU cluster, is also a step in the right direction. Who knows what data lies beyond the southern sky (or what we can compare it to when looking at images of the southern sky from centuries past)? The truth is that, because we do not know what we may discover, we need to be prepared to hold on to the data for future generations, as those before us did. This need to acquire new technology will continue to bring forth new frontiers.

Reference
G. BRUCE BERRIMAN, S. L. (2011). DATABASES. Retrieved 2 20, 2014, from HOW WILL ASTRONOMY ARCHIVES SURVIVE THE DATA TSUNAMI?: https://blackboard.strayer.edu/bbcswebdav/pid-10823189-dt-content-rid-58879684_4/institution/CIS/512/1128/Week6-1128/Week6-CaseStudy2-Tsunami.pdf
Hemsoth, N. (2011). Canada explores new frontiers in astroinformatics. Retrieved 2 20, 2014, from In HPC in the Cloud: http://www.hpcinthecloud.com/hpccloud/2011-01-17/canada_explores_new_frontiers_in_astroinformatics.html
Nereus overview. (n.d.). http://www-nereus.physics.ox.ac.uk/about_overview.html.
Myslewski, R. (2013, August 14). Seagate: Storage industry ill-prepared for onrushing big data tsunami. Retrieved 2 20, 2014, from The Register: http://www.theregister.co.uk/2013/08/14/storage_industry_unprepared_for_coming_data_needs/
Owens, T. (2013, November 14). Astronomical Data and Astronomical Digital Stewardship: An interview with Brian Schmidt. Retrieved 2 20, 2014, from http://blogs.loc.gov/digitalpreservation/2013/11/astronomical-data-and-astronomical-digital-stewardship-an-interview-with-Brian-Schmidt/
