{"id":143,"date":"2026-04-17T11:30:55","date_gmt":"2026-04-17T16:30:55","guid":{"rendered":"https:\/\/academia.utp.edu.co\/ia-e-industria\/?page_id=143"},"modified":"2026-04-17T11:36:33","modified_gmt":"2026-04-17T16:36:33","slug":"seguimiento-visual-de-un-manipulador-serial-utilizando-dnn","status":"publish","type":"page","link":"https:\/\/academia.utp.edu.co\/ia-e-industria\/seguimiento-visual-de-un-manipulador-serial-utilizando-dnn\/","title":{"rendered":"Visual Tracking of a Serial Manipulator Using DNNs"},"content":{"rendered":"\n<div class=\"wp-block-columns is-layout-flex wp-container-core-columns-is-layout-9d6595d7 wp-block-columns-is-layout-flex\">\n<div class=\"wp-block-column is-layout-flow wp-block-column-is-layout-flow\">\n<h2 class=\"wp-block-heading has-text-align-center has-black-color has-text-color has-link-color wp-elements-d94ebff2c37deb031b3d33cecb11aeac\"><strong>Authors:<\/strong><\/h2>\n\n\n\n<p class=\"has-black-color has-text-color has-link-color wp-elements-5ded261aa156fff218aaaf62f714a6bb\">Kevin David Ortega Qui\u00f1ones, Byron S. Hern\u00e1ndez, Jorge I. Sep\u00falveda, Henry Medeiros, Germ\u00e1n Andr\u00e9s Holgu\u00edn Londo\u00f1o.<\/p>\n<\/div>\n\n\n\n<div class=\"wp-block-column is-layout-flow wp-block-column-is-layout-flow\"><\/div>\n<\/div>\n\n\n\n<h1 class=\"wp-block-heading has-black-color has-text-color has-link-color wp-elements-ad50cd65d6c6333d549ac7ca33df6254\"><strong>Problem<\/strong><\/h1>\n\n\n\n<p class=\"has-black-color has-text-color has-link-color wp-elements-fd87008b76a5816353203b3e4a982c4c\">A critical challenge in industry is estimating the pose (position and orientation) of the end effector of a 6-degree-of-freedom articulated robotic arm. Traditionally, this estimate has been obtained from encoders mounted on the robot joints. 
However, this conventional approach has a significant drawback: the encoders can accumulate errors over time from sources such as friction, mechanical backlash, and manufacturing tolerances.<\/p>\n\n\n\n<p class=\"has-black-color has-text-color has-link-color wp-elements-03477f4651fc62125c8dcbb6dbe3248b\">In response, a low-cost, effective alternative is proposed: a visual estimation system based on RGBD cameras. These cameras provide color and depth information in real time. By capturing visual data of the scene and tracking the robotic arm as it moves, the system aims to determine the three-dimensional position and orientation of the end effector accurately.<\/p>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-medium\"><img loading=\"lazy\" decoding=\"async\" width=\"300\" height=\"300\" src=\"https:\/\/academia.utp.edu.co\/ia-e-industria\/files\/2026\/04\/14f9f113-7f9a-4f3d-8097-2aeda30322d7-300x300.png\" alt=\"\" class=\"wp-image-69\" srcset=\"https:\/\/academia.utp.edu.co\/ia-e-industria\/files\/2026\/04\/14f9f113-7f9a-4f3d-8097-2aeda30322d7-300x300.png 300w, https:\/\/academia.utp.edu.co\/ia-e-industria\/files\/2026\/04\/14f9f113-7f9a-4f3d-8097-2aeda30322d7-150x150.png 150w, https:\/\/academia.utp.edu.co\/ia-e-industria\/files\/2026\/04\/14f9f113-7f9a-4f3d-8097-2aeda30322d7.png 463w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><figcaption class=\"wp-element-caption\">Figure\u00a01:\u00a0UR5 arm in the Gazebo simulation environment.<\/figcaption><\/figure>\n<\/div>\n\n\n<h2 class=\"wp-block-heading has-black-color has-text-color has-link-color wp-elements-e59db135a866c4ee50dc2d6b2b0eb793\"><strong>
Dataset<\/strong><\/h2>\n\n\n\n<p class=\"has-black-color has-text-color has-link-color wp-elements-d2ad1b54c028bdf5be5f4a701b4b721d\">To train the system, a simulated dataset was created using ROS, Gazebo, and a model of a UR5 arm. Three RGBD cameras were placed at fixed positions pointing at the arm. Random arm poses within its workspace were generated by Monte Carlo sampling. For each generated pose, the RGB and depth images from all 3 cameras were saved.<\/p>\n\n\n\n<div class=\"wp-block-columns is-layout-flex wp-container-core-columns-is-layout-9d6595d7 wp-block-columns-is-layout-flex\">\n<div class=\"wp-block-column is-layout-flow wp-block-column-is-layout-flow\"><div class=\"wp-block-image\">\n<figure class=\"aligncenter size-medium\"><img loading=\"lazy\" decoding=\"async\" width=\"223\" height=\"300\" src=\"https:\/\/academia.utp.edu.co\/ia-e-industria\/files\/2026\/04\/fig2a-223x300.png\" alt=\"\" class=\"wp-image-145\" srcset=\"https:\/\/academia.utp.edu.co\/ia-e-industria\/files\/2026\/04\/fig2a-223x300.png 223w, https:\/\/academia.utp.edu.co\/ia-e-industria\/files\/2026\/04\/fig2a.png 295w\" sizes=\"auto, (max-width: 223px) 100vw, 223px\" \/><\/figure>\n<\/div><\/div>\n\n\n\n<div class=\"wp-block-column is-layout-flow wp-block-column-is-layout-flow\"><div class=\"wp-block-image\">\n<figure class=\"aligncenter size-medium\"><img loading=\"lazy\" decoding=\"async\" width=\"223\" height=\"300\" src=\"https:\/\/academia.utp.edu.co\/ia-e-industria\/files\/2026\/04\/fig2b-223x300.png\" alt=\"\" class=\"wp-image-146\" srcset=\"https:\/\/academia.utp.edu.co\/ia-e-industria\/files\/2026\/04\/fig2b-223x300.png 223w, https:\/\/academia.utp.edu.co\/ia-e-industria\/files\/2026\/04\/fig2b.png 295w\" sizes=\"auto, (max-width: 223px) 100vw, 223px\" \/><\/figure>\n<\/div><\/div>\n\n\n\n<div class=\"wp-block-column is-layout-flow wp-block-column-is-layout-flow\"><div 
class=\"wp-block-image\">\n<figure class=\"aligncenter size-medium\"><img loading=\"lazy\" decoding=\"async\" width=\"223\" height=\"300\" src=\"https:\/\/academia.utp.edu.co\/ia-e-industria\/files\/2026\/04\/fig2c-223x300.png\" alt=\"\" class=\"wp-image-147\" srcset=\"https:\/\/academia.utp.edu.co\/ia-e-industria\/files\/2026\/04\/fig2c-223x300.png 223w, https:\/\/academia.utp.edu.co\/ia-e-industria\/files\/2026\/04\/fig2c.png 295w\" sizes=\"auto, (max-width: 223px) 100vw, 223px\" \/><\/figure>\n<\/div><\/div>\n<\/div>\n\n\n\n<p class=\"has-text-align-center has-black-color has-text-color has-link-color wp-elements-3800eae23589bedc0cb0f0e7125fd3bc\">Figure\u00a02:\u00a0RGB images from the cameras.<\/p>\n\n\n\n<div class=\"wp-block-columns is-layout-flex wp-container-core-columns-is-layout-9d6595d7 wp-block-columns-is-layout-flex\">\n<div class=\"wp-block-column is-layout-flow wp-block-column-is-layout-flow\"><div class=\"wp-block-image\">\n<figure class=\"aligncenter size-medium\"><img loading=\"lazy\" decoding=\"async\" width=\"223\" height=\"300\" src=\"https:\/\/academia.utp.edu.co\/ia-e-industria\/files\/2026\/04\/fig3a-223x300.png\" alt=\"\" class=\"wp-image-149\" srcset=\"https:\/\/academia.utp.edu.co\/ia-e-industria\/files\/2026\/04\/fig3a-223x300.png 223w, https:\/\/academia.utp.edu.co\/ia-e-industria\/files\/2026\/04\/fig3a.png 295w\" sizes=\"auto, (max-width: 223px) 100vw, 223px\" \/><\/figure>\n<\/div><\/div>\n\n\n\n<div class=\"wp-block-column is-layout-flow wp-block-column-is-layout-flow\"><div class=\"wp-block-image\">\n<figure class=\"aligncenter size-medium\"><img loading=\"lazy\" decoding=\"async\" width=\"223\" height=\"300\" src=\"https:\/\/academia.utp.edu.co\/ia-e-industria\/files\/2026\/04\/fig3b-223x300.png\" alt=\"\" class=\"wp-image-150\" srcset=\"https:\/\/academia.utp.edu.co\/ia-e-industria\/files\/2026\/04\/fig3b-223x300.png 223w, https:\/\/academia.utp.edu.co\/ia-e-industria\/files\/2026\/04\/fig3b.png 295w\" sizes=\"auto, 
(max-width: 223px) 100vw, 223px\" \/><\/figure>\n<\/div><\/div>\n\n\n\n<div class=\"wp-block-column is-layout-flow wp-block-column-is-layout-flow\"><div class=\"wp-block-image\">\n<figure class=\"aligncenter size-medium\"><img loading=\"lazy\" decoding=\"async\" width=\"223\" height=\"300\" src=\"https:\/\/academia.utp.edu.co\/ia-e-industria\/files\/2026\/04\/fig3c-223x300.png\" alt=\"\" class=\"wp-image-151\" srcset=\"https:\/\/academia.utp.edu.co\/ia-e-industria\/files\/2026\/04\/fig3c-223x300.png 223w, https:\/\/academia.utp.edu.co\/ia-e-industria\/files\/2026\/04\/fig3c.png 295w\" sizes=\"auto, (max-width: 223px) 100vw, 223px\" \/><\/figure>\n<\/div><\/div>\n<\/div>\n\n\n\n<p class=\"has-text-align-center has-black-color has-text-color has-link-color wp-elements-29ec1a456016e63cf9641f7550720321\">Figure\u00a03:\u00a0depth images from the cameras.<\/p>\n\n\n\n<h2 class=\"wp-block-heading has-black-color has-text-color has-link-color wp-elements-3ddda2e640d6aee28b24aa63d26ceb8c\"><strong>Labeling<\/strong><\/h2>\n\n\n\n<p class=\"has-black-color has-text-color has-link-color wp-elements-f1738ba845bd4ac8a90d6bee7ba756d4\">In addition to the visual data captured by the RGBD cameras, the dataset was carefully labeled. Three-dimensional (3D) coordinates were assigned to each of the 6 joints of the robotic arm, yielding a full set of 18 labels. 
These labels consist of three values per joint, giving its position along the three axes: [x, y, z].<\/p>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"918\" height=\"194\" src=\"https:\/\/academia.utp.edu.co\/ia-e-industria\/files\/2026\/04\/tabla1.png\" alt=\"\" class=\"wp-image-152\" srcset=\"https:\/\/academia.utp.edu.co\/ia-e-industria\/files\/2026\/04\/tabla1.png 918w, https:\/\/academia.utp.edu.co\/ia-e-industria\/files\/2026\/04\/tabla1-300x63.png 300w, https:\/\/academia.utp.edu.co\/ia-e-industria\/files\/2026\/04\/tabla1-768x162.png 768w\" sizes=\"auto, (max-width: 918px) 100vw, 918px\" \/><\/figure>\n<\/div>\n\n\n<p class=\"has-text-align-center has-black-color has-text-color has-link-color wp-elements-308f1c75abda975f3fe4de934c55fc1d\">Table 1: structure of the labeled data.<\/p>\n\n\n\n<h2 class=\"wp-block-heading has-black-color has-text-color has-link-color wp-elements-d75f481eec7f70c388fcb7b25c6cd9f7\"><strong>Proposed Architecture<\/strong><\/h2>\n\n\n\n<p class=\"has-black-color has-text-color has-link-color wp-elements-d078af2fb67bf38b9494f67f0f7c61eb\">The proposed model uses a convolutional neural network (CNN) to process the images from the 3 RGBD cameras. 
The ResNet50 architecture serves as the CNN backbone, extracting a high-dimensional feature vector from the input images.<\/p>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"955\" height=\"302\" src=\"https:\/\/academia.utp.edu.co\/ia-e-industria\/files\/2026\/04\/fig4.png\" alt=\"\" class=\"wp-image-153\" srcset=\"https:\/\/academia.utp.edu.co\/ia-e-industria\/files\/2026\/04\/fig4.png 955w, https:\/\/academia.utp.edu.co\/ia-e-industria\/files\/2026\/04\/fig4-300x95.png 300w, https:\/\/academia.utp.edu.co\/ia-e-industria\/files\/2026\/04\/fig4-768x243.png 768w\" sizes=\"auto, (max-width: 955px) 100vw, 955px\" \/><\/figure>\n<\/div>\n\n\n<p class=\"has-text-align-center has-black-color has-text-color has-link-color wp-elements-4569cfc6709260d1e74032a8af60ed82\">Figure 4: proposed system for estimating the end-effector pose.<\/p>\n\n\n\n<p class=\"has-black-color has-text-color has-link-color wp-elements-f880b1f2d9bc102fa72db9730290ee04\">This feature vector is then passed through a cascade of fully connected neural networks, each of which predicts the 3D coordinates [x, y, z] of one joint of the robotic arm. In total there are 6 fully connected networks in the cascade, one per joint.<\/p>\n\n\n\n<p class=\"has-black-color has-text-color has-link-color wp-elements-761d920d1acab6cdd01b13d2e560e431\">Three variants of this approach were compared:<\/p>\n\n\n\n<p class=\"has-black-color has-text-color has-link-color wp-elements-e8cb629512cf0ef54dd921b49ced4d06\"><strong>Baseline:<\/strong>&nbsp;Each fully connected network receives only the feature vector extracted by ResNet50. 
No information is passed between the networks.<\/p>\n\n\n\n<p class=\"has-black-color has-text-color has-link-color wp-elements-107ece56b6ac24275805b5b04f11b60a\"><strong>Cascade:<\/strong>&nbsp;Each network receives not only the ResNet50 vector but also the outputs (3D coordinates) of the immediately preceding network.<\/p>\n\n\n\n<p class=\"has-black-color has-text-color has-link-color wp-elements-67b7bf5dc8d9f849afdf53ed29dc3f93\"><strong>Full cascade:<\/strong>&nbsp;Each network receives the ResNet50 vector and the outputs of all preceding networks in the cascade.<\/p>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"998\" height=\"261\" src=\"https:\/\/academia.utp.edu.co\/ia-e-industria\/files\/2026\/04\/fig5.png\" alt=\"\" class=\"wp-image-154\" srcset=\"https:\/\/academia.utp.edu.co\/ia-e-industria\/files\/2026\/04\/fig5.png 998w, https:\/\/academia.utp.edu.co\/ia-e-industria\/files\/2026\/04\/fig5-300x78.png 300w, https:\/\/academia.utp.edu.co\/ia-e-industria\/files\/2026\/04\/fig5-768x201.png 768w\" sizes=\"auto, (max-width: 998px) 100vw, 998px\" \/><\/figure>\n<\/div>\n\n\n<p class=\"has-text-align-center has-black-color has-text-color has-link-color wp-elements-b6f89ce78f65785b8d43a8795b9fd47b\">Figure\u00a05:\u00a0CNN followed by a fully connected cascade structure.<\/p>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"956\" height=\"395\" src=\"https:\/\/academia.utp.edu.co\/ia-e-industria\/files\/2026\/04\/fig6.png\" alt=\"\" class=\"wp-image-155\" srcset=\"https:\/\/academia.utp.edu.co\/ia-e-industria\/files\/2026\/04\/fig6.png 956w, https:\/\/academia.utp.edu.co\/ia-e-industria\/files\/2026\/04\/fig6-300x124.png 300w, https:\/\/academia.utp.edu.co\/ia-e-industria\/files\/2026\/04\/fig6-768x317.png 768w\" sizes=\"auto, (max-width: 956px) 100vw, 956px\" 
\/><\/figure>\n<\/div>\n\n\n<p class=\"has-text-align-center has-black-color has-text-color has-link-color wp-elements-94bbebec465ec1953dbf2761bc526f5e\">Figure\u00a06:\u00a0fully connected block. 1000-256-64-3 neural network.<\/p>\n\n\n\n<p class=\"has-black-color has-text-color has-link-color wp-elements-336e722ff70e0615d0b1107e43600f27\">The hypothesis was that providing each network with the previous joint estimates would improve performance by adding information relevant to each prediction. The full cascade method showed the best performance during validation.<\/p>\n\n\n\n<h2 class=\"wp-block-heading has-black-color has-text-color has-link-color wp-elements-9662f4ef4cae0a1be119e479a9e91c12\"><strong>Results<\/strong><\/h2>\n\n\n\n<p class=\"has-black-color has-text-color has-link-color wp-elements-3ff1e876c0a3cfb0be074c5e7b24f8fb\">The full cascade method, in which each stage receives information from all previous stages, achieved the lowest validation error. This indicates that providing the previous estimates improves model performance. Visual inspection also showed good accuracy in the 3D predictions.<\/p>\n\n\n\n<h2 class=\"wp-block-heading has-black-color has-text-color has-link-color wp-elements-c3b08c368fd108225435060b75217471\"><strong>BibTeX<\/strong><\/h2>\n\n\n\n<pre class=\"wp-block-code has-black-color has-text-color has-link-color wp-elements-324e311ccef7c08b352b0a2b7ad3fe36\"><code>@article{ortegaKevin2023,\n\n  title={Seguimiento visual de un manipulador serial utilizando redes neuronales profundas},\n\n  author={Ortega, Kevin David},\n\n  year={2023},\n\n  school={Universidad Tecnol{\\'o}gica de Pereira}\n\n}<\/code><\/pre>\n\n\n\n<p><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Authors: Kevin David Ortega Qui\u00f1ones, Byron S. Hern\u00e1ndez, Jorge I. 
Sep\u00falveda, Henry Medeiros, Germ\u00e1n Andr\u00e9s Holgu\u00edn Londo\u00f1o. Problem A critical challenge in industry is estimating the&hellip;<\/p>\n<p class=\"more-link\"><a href=\"https:\/\/academia.utp.edu.co\/ia-e-industria\/seguimiento-visual-de-un-manipulador-serial-utilizando-dnn\/\" class=\"themebutton2\">Read more<\/a><\/p>\n","protected":false},"author":1980,"featured_media":0,"parent":0,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-143","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/academia.utp.edu.co\/ia-e-industria\/wp-json\/wp\/v2\/pages\/143","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/academia.utp.edu.co\/ia-e-industria\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/academia.utp.edu.co\/ia-e-industria\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/academia.utp.edu.co\/ia-e-industria\/wp-json\/wp\/v2\/users\/1980"}],"replies":[{"embeddable":true,"href":"https:\/\/academia.utp.edu.co\/ia-e-industria\/wp-json\/wp\/v2\/comments?post=143"}],"version-history":[{"count":5,"href":"https:\/\/academia.utp.edu.co\/ia-e-industria\/wp-json\/wp\/v2\/pages\/143\/revisions"}],"predecessor-version":[{"id":161,"href":"https:\/\/academia.utp.edu.co\/ia-e-industria\/wp-json\/wp\/v2\/pages\/143\/revisions\/161"}],"wp:attachment":[{"href":"https:\/\/academia.utp.edu.co\/ia-e-industria\/wp-json\/wp\/v2\/media?parent=143"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}