Decision Trees in the Classification of Astronomical Data

R.S.R. Ruiz, H.F. de Campos Velho, R.D.C. Santos, M. Trevisan

Abstract


Optical astronomy records are an extremely important source of information, and these measurements are fundamental for classifying stars and galaxies. This work describes the J4.8 decision tree construction algorithm and its application to building classifiers, based on photometric attributes, that separate astronomical objects into stars and galaxies. Data from the Sloan Digital Sky Survey (SDSS) project were used to train and validate the classifiers. On the test set, the classifiers achieved accuracy rates above 98% for stars and above 99% for galaxies.
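
To make the tree-construction idea concrete, the sketch below shows how a C4.5-style learner (J4.8 is an implementation of C4.5) might score a binary split on one numeric attribute using the gain ratio. It is a simplified illustration, not the authors' code: the attribute values and labels are made-up toy data, the function names are hypothetical, and taking midpoints between consecutive values as candidate thresholds is an assumption made for brevity.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def gain_ratio(values, labels, threshold):
    """Gain ratio of the binary split `value <= threshold`."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    if not left or not right:
        return 0.0  # a split that leaves one side empty carries no information
    total = len(labels)
    weights = (len(left) / total, len(right) / total)
    info_gain = (entropy(labels)
                 - weights[0] * entropy(left)
                 - weights[1] * entropy(right))
    split_info = -sum(w * math.log2(w) for w in weights)
    return info_gain / split_info


def best_threshold(values, labels):
    """Score midpoints between consecutive distinct values; return the best (threshold, ratio)."""
    distinct = sorted(set(values))
    candidates = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    return max(((t, gain_ratio(values, labels, t)) for t in candidates),
               key=lambda pair: pair[1])


# Toy data (hypothetical): one photometric-style attribute and its class labels.
attribute = [1.0, 2.0, 3.0, 7.0, 8.0, 9.0]
classes = ["star", "star", "star", "galaxy", "galaxy", "galaxy"]

threshold, ratio = best_threshold(attribute, classes)
print(f"best threshold: {threshold}, gain ratio: {ratio:.3f}")
```

A full tree learner would repeat this scoring over every attribute, split on the best one, and recurse on each branch until the leaves are sufficiently pure, followed by pruning.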


DOI: https://doi.org/10.5540/tema.2009.010.01.0075




Trends in Computational and Applied Mathematics

A publication of the Brazilian Society of Applied and Computational Mathematics (SBMAC)

 
