{"id":829,"date":"2025-07-29T20:36:03","date_gmt":"2025-07-30T00:36:03","guid":{"rendered":"https:\/\/literaciadigital.ufms.br\/?page_id=829"},"modified":"2025-10-11T17:16:49","modified_gmt":"2025-10-11T21:16:49","slug":"15-3","status":"publish","type":"page","link":"https:\/\/literaciadigital.ufms.br\/en\/data8\/15-0\/15-3\/","title":{"rendered":"Chapter 15.3"},"content":{"rendered":"<div style=\"position: relative\">\n<p><!-- Main Content --><\/p>\n<div style=\"overflow: hidden\">\n<pre><code><span style=\"color: black\">from datascience import *\r\n%matplotlib inline\r\npath_data = '..\/..\/..\/assets\/data\/'\r\nimport matplotlib.pyplot as plots\r\nplots.style.use('fivethirtyeight')\r\nimport numpy as np<\/span><\/code><\/pre>\n<p>&nbsp;<\/p>\n<h1 id=\"o-m-todo-dos-m-nimos-quadrados\" style=\"text-align: center\">The Method of Least Squares<\/h1>\n<p style=\"text-align: justify\">We have developed the equation of the regression line that runs through a football-shaped scatter plot. But not all scatter plots are football shaped, not even linear ones. Does every scatter plot have a &#8220;best&#8221; line that goes through it? If so, can we still use the formulas for the slope and intercept developed in the previous section, or do we need new ones?<\/p>\n<p style=\"text-align: justify\">To address these questions, we need a reasonable definition of &#8220;best&#8221;. Recall that the purpose of the line is to <em>predict<\/em> or <em>estimate<\/em> values of y, given values of x. Estimates typically aren't perfect. Each one is off the true value by an <em>error<\/em>.
A reasonable criterion for a line to be the &#8220;best&#8221; is for it to have the smallest possible overall error among all straight lines.<\/p>\n<p style=\"text-align: justify\">In this section we will make this criterion precise and see whether we can identify the best straight line under the criterion.<\/p>\n<pre><code><span style=\"color: black\">def standard_units(any_numbers):\r\n    \"Convert any array of numbers to standard units.\"\r\n    return (any_numbers - np.mean(any_numbers))\/np.std(any_numbers)\r\n\r\ndef correlation(t, x, y):\r\n    return np.mean(standard_units(t.column(x))*standard_units(t.column(y)))\r\n\r\ndef slope(table, x, y):\r\n    r = correlation(table, x, y)\r\n    return r * np.std(table.column(y))\/np.std(table.column(x))\r\n\r\ndef intercept(table, x, y):\r\n    a = slope(table, x, y)\r\n    return np.mean(table.column(y)) - a * np.mean(table.column(x))\r\n\r\ndef fit(table, x, y):\r\n    \"\"\"Return the height of the regression line at each x value.\"\"\"\r\n    a = slope(table, x, y)\r\n    b = intercept(table, x, y)\r\n    return a * table.column(x) + b<\/span><\/code><\/pre>\n<p style=\"text-align: justify\">Our first example is a dataset that has one row for each chapter of the novel &#8220;Little Women.&#8221; The goal is to estimate the number of characters (that is, letters, spaces, punctuation marks, and so on) based on the number of periods.
Recall that we attempted to do this in the very first lecture of this course.<\/p>\n<pre><code><span style=\"color: black\">little_women = Table.read_table(path_data + 'little_women.csv')\r\nlittle_women = little_women.move_to_start('Periods')\r\nlittle_women.show(3)<\/span><\/code><\/pre>\n<table style=\"font-family: monospace;border-collapse: collapse;width: auto;margin-left: 1em\" border=\"1\">\n<thead>\n<tr style=\"background-color: #f0f0f0;border-bottom: 2px solid #ddd\">\n<th style=\"text-align: left;padding: 4px 8px\">Periods<\/th>\n<th style=\"text-align: left;padding: 4px 8px\">Characters<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">189<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">21759<\/td>\n<\/tr>\n<tr style=\"background-color: #f8f8f8\">\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">188<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">22148<\/td>\n<\/tr>\n<tr>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">231<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">20558<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<pre><code><span style=\"color: black\">little_women.scatter('Periods', 'Characters')<\/span><\/code><\/pre>\n<p style=\"text-align: justify\"><img fetchpriority=\"high\" decoding=\"async\" class=\"alignnone size-full wp-image-831\" src=\"https:\/\/literaciadigital.ufms.br\/files\/2025\/07\/15-3-1.png\" alt=\"\" width=\"394\" height=\"342\" srcset=\"https:\/\/literaciadigital.ufms.br\/files\/2025\/07\/15-3-1.png 394w, https:\/\/literaciadigital.ufms.br\/files\/2025\/07\/15-3-1-300x260.png 300w, https:\/\/literaciadigital.ufms.br\/files\/2025\/07\/15-3-1-369x320.png 369w\" sizes=\"(max-width: 394px) 100vw, 394px\" \/><\/p>\n<p style=\"text-align: justify\">To explore the data, we will need to use the functions <code>correlation<\/code>, <code>slope<\/code>, <code>intercept<\/code>, and <code>fit<\/code> defined in the previous section.<\/p>\n<pre><code><span style=\"color: black\">correlation(little_women, 'Periods', 'Characters')<\/span><\/code><\/pre>\n<table style=\"font-family: monospace;border-spacing: 0;border-collapse: collapse;width: auto;margin-left: 1em\">\n<tbody>\n<tr>\n<td style=\"text-align: right;color: #888;padding-right: 0.5em\">Out[1]:<\/td>\n<td style=\"text-align: left\">0.9229576895854816<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p style=\"text-align: justify\">The scatter plot is remarkably close to linear, and the correlation is more than 0.92.<\/p>\n<h2 id=\"erro-na-estimativa\" style=\"text-align: justify\">Error in Estimation<\/h2>\n<p style=\"text-align: justify\">The graph below shows the scatter plot and the line that we developed in the previous section. We don't yet know whether that is the best among all lines. We first have to say precisely what &#8220;best&#8221; means.<\/p>\n<pre><code><span style=\"color: black\">lw_with_predictions = little_women.with_column('Linear Prediction', fit(little_women, 'Periods', 'Characters'))\r\nlw_with_predictions.scatter('Periods')<\/span><\/code><\/pre>\n<p style=\"text-align: justify\"><img decoding=\"async\" class=\"alignnone size-full wp-image-832\" src=\"https:\/\/literaciadigital.ufms.br\/files\/2025\/07\/15-3-2.png\" alt=\"\" width=\"566\" height=\"345\" srcset=\"https:\/\/literaciadigital.ufms.br\/files\/2025\/07\/15-3-2.png 566w, https:\/\/literaciadigital.ufms.br\/files\/2025\/07\/15-3-2-300x183.png 300w, https:\/\/literaciadigital.ufms.br\/files\/2025\/07\/15-3-2-525x320.png 525w\" sizes=\"(max-width: 566px) 100vw, 566px\" \/><\/p>\n<p style=\"text-align: justify\">Corresponding to each point on the scatter plot, there is an error of prediction calculated as the actual value minus the predicted value.
It is the vertical distance between the point and the line, with a negative sign if the point is below the line.<\/p>\n<pre><code><span style=\"color: black\">actual = lw_with_predictions.column('Characters')\r\npredicted = lw_with_predictions.column('Linear Prediction')\r\nerrors = actual - predicted<\/span><\/code><\/pre>\n<pre><code><span style=\"color: black\">lw_with_predictions.with_column('Error', errors)<\/span><\/code><\/pre>\n<table style=\"font-family: monospace;border-collapse: collapse;width: auto;margin-left: 1em\" border=\"1\">\n<thead>\n<tr style=\"background-color: #f0f0f0;border-bottom: 2px solid #ddd\">\n<th style=\"text-align: left;padding: 4px 8px\">Periods<\/th>\n<th style=\"text-align: left;padding: 4px 8px\">Characters<\/th>\n<th style=\"text-align: left;padding: 4px 8px\">Linear Prediction<\/th>\n<th style=\"text-align: left;padding: 4px 8px\">Error<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">189<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">21759<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">21183.6<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">575.403<\/td>\n<\/tr>\n<tr style=\"background-color: #f8f8f8\">\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">188<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">22148<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">21096.6<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">1051.38<\/td>\n<\/tr>\n<tr>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">231<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">20558<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">24836.7<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">-4278.67<\/td>\n<\/tr>\n<tr style=\"background-color: #f8f8f8\">\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">195<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">25526<\/td>\n<td
style=\"padding: 4px 8px;border: 1px solid #ddd\">21705.5<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">3820.54<\/td>\n<\/tr>\n<tr>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">255<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">23395<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">26924.1<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">-3529.13<\/td>\n<\/tr>\n<tr style=\"background-color: #f8f8f8\">\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">140<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">14622<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">16921.7<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">-2299.68<\/td>\n<\/tr>\n<tr>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">131<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">14431<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">16138.9<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">-1707.88<\/td>\n<\/tr>\n<tr style=\"background-color: #f8f8f8\">\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">214<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">22476<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">23358<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">-882.043<\/td>\n<\/tr>\n<tr>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">337<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">33767<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">34056.3<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">-289.317<\/td>\n<\/tr>\n<tr style=\"background-color: #f8f8f8\">\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">185<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">18508<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">20835.7<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">-2327.69<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p 
style=\"text-align: justify\">We can use <code>slope<\/code> and <code>intercept<\/code> to calculate the slope and intercept of the fitted line. The graph below shows the line (in light blue). The errors corresponding to four of the points are shown in red. There is nothing special about those four points. They were chosen only for clarity of the display. The function <code>lw_errors<\/code> takes a slope and an intercept (in that order) as its arguments and draws the figure.<\/p>\n<pre><code><span style=\"color: black\">lw_reg_slope = slope(little_women, 'Periods', 'Characters')\r\nlw_reg_intercept = intercept(little_women, 'Periods', 'Characters')<\/span><\/code><\/pre>\n<pre><code><span style=\"color: black\">\r\nsample = [[131, 14431], [231, 20558], [392, 40935], [157, 23524]]\r\ndef lw_errors(slope, intercept):\r\n    little_women.scatter('Periods', 'Characters')\r\n    xlims = np.array([50, 450])\r\n    plots.plot(xlims, slope * xlims + intercept, lw=2)\r\n    for x, y in sample:\r\n        plots.plot([x, x], [y, slope * x + intercept], color='r', lw=2)<\/span><\/code><\/pre>\n<pre><code><span style=\"color: black\">print('Slope of Regression Line:    ', np.round(lw_reg_slope), 'characters per period')\r\nprint('Intercept of Regression Line:', np.round(lw_reg_intercept), 'characters')\r\nlw_errors(lw_reg_slope, lw_reg_intercept)<\/span><\/code><\/pre>\n<table style=\"font-family: monospace;border-spacing: 0;border-collapse: collapse;width: auto;margin-left: 1em\">\n<tbody>\n<tr>\n<td style=\"text-align: right;color: #888;padding-right: 0.5em\">Out[2]:<\/td>\n<td style=\"text-align: left\">Slope of Regression Line: 87.0 characters per period<br \/>\nIntercept of Regression Line: 4745.0 characters<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p style=\"text-align: justify\"><img decoding=\"async\" class=\"alignnone size-full wp-image-833\"
src=\"https:\/\/literaciadigital.ufms.br\/files\/2025\/07\/15-3-3.png\" alt=\"\" width=\"394\" height=\"343\" srcset=\"https:\/\/literaciadigital.ufms.br\/files\/2025\/07\/15-3-3.png 394w, https:\/\/literaciadigital.ufms.br\/files\/2025\/07\/15-3-3-300x261.png 300w, https:\/\/literaciadigital.ufms.br\/files\/2025\/07\/15-3-3-368x320.png 368w\" sizes=\"(max-width: 394px) 100vw, 394px\" \/><\/p>\n<p style=\"text-align: justify\">Had we used a different line to create our estimates, the errors would have been different. The first graph below shows how big the errors would be if we used another line for estimation. The second graph shows the large errors obtained by using a line that is downright silly.<\/p>\n<pre><code><span style=\"color: black\">lw_errors(50, 10000)<\/span><\/code><\/pre>\n<p style=\"text-align: justify\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-834\" src=\"https:\/\/literaciadigital.ufms.br\/files\/2025\/07\/15-3-4.png\" alt=\"\" width=\"394\" height=\"342\" srcset=\"https:\/\/literaciadigital.ufms.br\/files\/2025\/07\/15-3-4.png 394w, https:\/\/literaciadigital.ufms.br\/files\/2025\/07\/15-3-4-300x260.png 300w, https:\/\/literaciadigital.ufms.br\/files\/2025\/07\/15-3-4-369x320.png 369w\" sizes=\"(max-width: 394px) 100vw, 394px\" \/><\/p>\n<pre><code><span style=\"color: black\">lw_errors(-100, 50000)<\/span><\/code><\/pre>\n<p style=\"text-align: justify\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-835\" src=\"https:\/\/literaciadigital.ufms.br\/files\/2025\/07\/15-3-5.png\" alt=\"\" width=\"394\" height=\"342\" srcset=\"https:\/\/literaciadigital.ufms.br\/files\/2025\/07\/15-3-5.png 394w, https:\/\/literaciadigital.ufms.br\/files\/2025\/07\/15-3-5-300x260.png 300w, https:\/\/literaciadigital.ufms.br\/files\/2025\/07\/15-3-5-369x320.png 369w\" sizes=\"(max-width: 394px) 100vw, 394px\" \/><\/p>\n<h2 id=\"erro-quadr-tico-m-dio\"
style=\"text-align: justify\">Mean Squared Error<\/h2>\n<p style=\"text-align: justify\">What we need now is one overall measure of the rough size of the errors. You will recognize the approach to creating this: it is exactly the way we developed the SD.<\/p>\n<p style=\"text-align: justify\">If you use any arbitrary line to calculate your estimates, then some of your errors are likely to be positive and others negative. To avoid cancellation when measuring the rough size of the errors, we will take the mean of the squared errors rather than the mean of the errors themselves.<\/p>\n<p style=\"text-align: justify\">The mean squared error of estimation is a measure of roughly how big the squared errors are, but as we noted earlier, its units are hard to interpret. Taking the square root yields the root mean squared error (rmse), which is in the same units as the variable being predicted and is therefore much easier to understand.<\/p>\n<h2 id=\"minimiza-o-do-erro-quadr-tico-m-dio\" style=\"text-align: justify\">Minimizing the Mean Squared Error<\/h2>\n<p style=\"text-align: justify\">Our observations so far can be summarized as follows.<\/p>\n<ul style=\"text-align: justify\">\n<li>To get estimates of y based on x, you can use any line you want.<\/li>\n<li>Every line has a mean squared error of estimation.<\/li>\n<li>&#8220;Better&#8221; lines have smaller errors.<\/li>\n<\/ul>\n<p style=\"text-align: justify\">Is there a &#8220;best&#8221; line?
That is, is there a line that minimizes the mean squared error among all lines?<\/p>\n<p style=\"text-align: justify\">To answer this question, we will start by defining a function <code>lw_rmse<\/code> to compute the root mean squared error of any line through the scatter diagram of Little Women. The function takes the slope and the intercept (in that order) as its arguments.<\/p>\n<pre><code><span style=\"color: black\">def lw_rmse(slope, intercept):\r\n    # Plot the line and its errors (lw_errors is defined earlier)\r\n    lw_errors(slope, intercept)\r\n    x = little_women.column('Periods')\r\n    y = little_women.column('Characters')\r\n    # Values predicted by the line\r\n    fitted = slope * x + intercept\r\n    # Mean of the squared errors\r\n    mse = np.mean((y - fitted) ** 2)\r\n    print(\"Root mean squared error:\", mse ** 0.5)<\/span><\/code><\/pre>\n<pre><code><span style=\"color: black\">lw_rmse(50, 10000)<\/span><\/code><\/pre>\n<table style=\"font-family: monospace;border-spacing: 0;border-collapse: collapse;width: auto;margin-left: 1em\">\n<tbody>\n<tr>\n<td style=\"text-align: right;color: #888;padding-right: 0.5em\">Out[3]:<\/td>\n<td style=\"text-align: left\">Root mean squared error: 4322.167831766537<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p style=\"text-align: justify\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-836\" src=\"https:\/\/literaciadigital.ufms.br\/files\/2025\/07\/15-3-6.png\" alt=\"\" width=\"394\" height=\"342\" srcset=\"https:\/\/literaciadigital.ufms.br\/files\/2025\/07\/15-3-6.png 394w, https:\/\/literaciadigital.ufms.br\/files\/2025\/07\/15-3-6-300x260.png 300w, https:\/\/literaciadigital.ufms.br\/files\/2025\/07\/15-3-6-369x320.png 369w\" sizes=\"(max-width: 394px) 100vw, 394px\" \/><\/p>\n<pre><code><span style=\"color: black\">lw_rmse(-100, 50000)<\/span><\/code><\/pre>\n<table style=\"font-family: monospace;border-spacing: 0;border-collapse: collapse;width: auto;margin-left: 1em\">\n<tbody>\n<tr>\n<td style=\"text-align: 
right;color: #888;padding-right: 0.5em\">Out[4]:<\/td>\n<td style=\"text-align: left\">Root mean squared error: 16710.11983735375<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p style=\"text-align: justify\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-837\" src=\"https:\/\/literaciadigital.ufms.br\/files\/2025\/07\/15-3-7.png\" alt=\"\" width=\"394\" height=\"342\" srcset=\"https:\/\/literaciadigital.ufms.br\/files\/2025\/07\/15-3-7.png 394w, https:\/\/literaciadigital.ufms.br\/files\/2025\/07\/15-3-7-300x260.png 300w, https:\/\/literaciadigital.ufms.br\/files\/2025\/07\/15-3-7-369x320.png 369w\" sizes=\"(max-width: 394px) 100vw, 394px\" \/><\/p>\n<p style=\"text-align: justify\">Bad lines have big values of rmse, as expected. But the rmse is much smaller if we choose a slope and intercept close to those of the regression line.<\/p>\n<pre><code><span style=\"color: black\">lw_rmse(90, 4000)<\/span><\/code><\/pre>\n<table style=\"font-family: monospace;border-spacing: 0;border-collapse: collapse;width: auto;margin-left: 1em\">\n<tbody>\n<tr>\n<td style=\"text-align: right;color: #888;padding-right: 0.5em\">Out[5]:<\/td>\n<td style=\"text-align: left\">Root mean squared error: 2715.5391063834586<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p style=\"text-align: justify\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-838\" src=\"https:\/\/literaciadigital.ufms.br\/files\/2025\/07\/15-3-8.png\" alt=\"\" width=\"394\" height=\"342\" srcset=\"https:\/\/literaciadigital.ufms.br\/files\/2025\/07\/15-3-8.png 394w, https:\/\/literaciadigital.ufms.br\/files\/2025\/07\/15-3-8-300x260.png 300w, https:\/\/literaciadigital.ufms.br\/files\/2025\/07\/15-3-8-369x320.png 369w\" sizes=\"(max-width: 394px) 100vw, 394px\" \/><\/p>\n<p style=\"text-align: justify\">Here is the root mean squared error corresponding to the regression line. 
By a remarkable fact of mathematics, no other line can beat this one.<\/p>\n<ul style=\"text-align: justify\">\n<li><strong>The regression line is the unique straight line that minimizes the mean squared error of estimation among all straight lines.<\/strong><\/li>\n<\/ul>\n<pre><code><span style=\"color: black\">lw_rmse(lw_reg_slope, lw_reg_intercept)<\/span><\/code><\/pre>\n<table style=\"font-family: monospace;border-spacing: 0;border-collapse: collapse;width: auto;margin-left: 1em\">\n<tbody>\n<tr>\n<td style=\"text-align: right;color: #888;padding-right: 0.5em\">Out[6]:<\/td>\n<td style=\"text-align: left\">Root mean squared error: 2701.690785311856<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p style=\"text-align: justify\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-839\" src=\"https:\/\/literaciadigital.ufms.br\/files\/2025\/07\/15-3-9.png\" alt=\"\" width=\"394\" height=\"343\" srcset=\"https:\/\/literaciadigital.ufms.br\/files\/2025\/07\/15-3-9.png 394w, https:\/\/literaciadigital.ufms.br\/files\/2025\/07\/15-3-9-300x261.png 300w, https:\/\/literaciadigital.ufms.br\/files\/2025\/07\/15-3-9-368x320.png 368w\" sizes=\"(max-width: 394px) 100vw, 394px\" \/><\/p>\n<p style=\"text-align: justify\">The proof of this statement requires abstract mathematics that is beyond the scope of this course. On the other hand, we have a powerful tool, Python, that performs large numerical computations with ease. So we can use Python to confirm that the regression line minimizes the mean squared error.<\/p>\n<h3 id=\"otimiza-o-num-rica\" style=\"text-align: justify\">Numerical Optimization<\/h3>\n<p style=\"text-align: justify\">First, note that a line that minimizes the root mean squared error (rmse) is also a line that minimizes the mean squared error (mse). 
The square root makes no difference to the minimization, because it is an increasing function. So we will save ourselves a step of computation and just minimize the mean squared error (mse).<\/p>\n<p style=\"text-align: justify\">We are trying to predict the number of characters (y) based on the number of periods (x) in the chapters of &#8216;Little Women&#8217;. If we use the line<\/p>\n<p>&nbsp;<\/p>\n<div style=\"text-align: center;font-family: serif;font-size: 2.2em\">prediction = ax + b<\/div>\n<p>&nbsp;<\/p>\n<p style=\"text-align: justify\">it will have an mse that depends on the slope a and the intercept b. The function <code>lw_mse<\/code> takes the slope and the intercept as its arguments and returns the corresponding mse.<\/p>\n<pre><code><span style=\"color: black\">def lw_mse(any_slope, any_intercept):\r\n    x = little_women.column('Periods')\r\n    y = little_women.column('Characters')\r\n    fitted = any_slope*x + any_intercept\r\n    return np.mean((y - fitted) ** 2)<\/span><\/code><\/pre>\n<p style=\"text-align: justify\">Let us check that <code>lw_mse<\/code> gets the right answer for the root mean squared error of the regression line. 
Remember that <code>lw_mse<\/code> returns the mean squared error, so we have to take the square root to get the rmse.<\/p>\n<pre><code><span style=\"color: black\">lw_mse(lw_reg_slope, lw_reg_intercept)**0.5<\/span><\/code><\/pre>\n<table style=\"font-family: monospace;border-spacing: 0;border-collapse: collapse;width: auto;margin-left: 1em\">\n<tbody>\n<tr>\n<td style=\"text-align: right;color: #888;padding-right: 0.5em\">Out[7]:<\/td>\n<td style=\"text-align: left\">2701.690785311856<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p style=\"text-align: justify\">This is the same value we got by using <code>lw_rmse<\/code> earlier:<\/p>\n<pre><code><span style=\"color: black\">lw_rmse(lw_reg_slope, lw_reg_intercept)<\/span><\/code><\/pre>\n<table style=\"font-family: monospace;border-spacing: 0;border-collapse: collapse;width: auto;margin-left: 1em\">\n<tbody>\n<tr>\n<td style=\"text-align: right;color: #888;padding-right: 0.5em\">Out[8]:<\/td>\n<td style=\"text-align: left\">Root mean squared error: 2701.690785311856<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p style=\"text-align: justify\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-840\" src=\"https:\/\/literaciadigital.ufms.br\/files\/2025\/07\/15-3-10.png\" alt=\"\" width=\"394\" height=\"343\" srcset=\"https:\/\/literaciadigital.ufms.br\/files\/2025\/07\/15-3-10.png 394w, https:\/\/literaciadigital.ufms.br\/files\/2025\/07\/15-3-10-300x261.png 300w, https:\/\/literaciadigital.ufms.br\/files\/2025\/07\/15-3-10-368x320.png 368w\" sizes=\"(max-width: 394px) 100vw, 394px\" \/><\/p>\n<p style=\"text-align: justify\">You can confirm that <code>lw_mse<\/code> also returns the correct value for other slopes and intercepts. 
For example, here is the rmse of the extremely bad line that we tried earlier.<\/p>\n<pre><code><span style=\"color: black\">lw_mse(-100, 50000)**0.5<\/span><\/code><\/pre>\n<table style=\"font-family: monospace;border-spacing: 0;border-collapse: collapse;width: auto;margin-left: 1em\">\n<tbody>\n<tr>\n<td style=\"text-align: right;color: #888;padding-right: 0.5em\">Out[9]:<\/td>\n<td style=\"text-align: left\">16710.11983735375<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p style=\"text-align: justify\">And here is the rmse of a line that is close to the regression line.<\/p>\n<pre><code><span style=\"color: black\">lw_mse(90, 4000)**0.5<\/span><\/code><\/pre>\n<table style=\"font-family: monospace;border-spacing: 0;border-collapse: collapse;width: auto;margin-left: 1em\">\n<tbody>\n<tr>\n<td style=\"text-align: right;color: #888;padding-right: 0.5em\">Out[10]:<\/td>\n<td style=\"text-align: left\">2715.5391063834586<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p style=\"text-align: justify\">If we experiment with different values, we can find a low-error slope and intercept through trial and error, but that would take a while. Fortunately, there is a Python function that does all the trial and error for us.<\/p>\n<p style=\"text-align: justify\">The <code>minimize<\/code> function can be used to find the arguments of a function for which the function returns its minimum value. Python uses a similar trial-and-error approach, following the changes that lead to incrementally lower output values.<\/p>\n<p style=\"text-align: justify\">The argument of <code>minimize<\/code> is a function that itself takes numerical arguments and returns a numerical value. 
For example, the function <code>lw_mse<\/code> takes a numerical slope and intercept as its arguments and returns the corresponding mse.<\/p>\n<p style=\"text-align: justify\">The call <code>minimize(lw_mse)<\/code> returns an array consisting of the slope and the intercept that minimize the mse. These minimizing values are excellent approximations arrived at by intelligent trial and error, not exact values based on formulas.<\/p>\n<pre><code><span style=\"color: black\">best = minimize(lw_mse)\r\nbest<\/span><\/code><\/pre>\n<table style=\"font-family: monospace;border-spacing: 0;border-collapse: collapse;width: auto;margin-left: 1em\">\n<tbody>\n<tr>\n<td style=\"text-align: right;color: #888;padding-right: 0.5em\">Out[11]:<\/td>\n<td style=\"text-align: left\">array([ 86.97784117, 4744.78484535])<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p style=\"text-align: justify\">These values are the same as the values we calculated earlier by using the <code>slope<\/code> and <code>intercept<\/code> functions. 
We see small deviations due to the inexact nature of <code>minimize<\/code>, but the values are essentially the same.<\/p>\n<pre><code><span style=\"color: black\">print(\"slope from formula:        \", lw_reg_slope)\r\nprint(\"slope from minimize:       \", best.item(0))\r\nprint(\"intercept from formula:    \", lw_reg_intercept)\r\nprint(\"intercept from minimize:   \", best.item(1))<\/span><\/code><\/pre>\n<table style=\"font-family: monospace;border-spacing: 0;border-collapse: collapse;width: auto;margin-left: 1em\">\n<tbody>\n<tr>\n<td style=\"text-align: right;color: #888;padding-right: 0.5em\">Out[12]:<\/td>\n<td style=\"text-align: left\">slope from formula: 86.97784125829821<br \/>\nslope from minimize: 86.97784116615884<br \/>\nintercept from formula: 4744.784796574928<br \/>\nintercept from minimize: 4744.784845352655<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<h2 id=\"a-linha-dos-m-nimos-quadrados\" style=\"text-align: justify\">The Least Squares Line<\/h2>\n<p style=\"text-align: justify\">Therefore, we have found not only that the regression line minimizes the mean squared error, but also that minimizing the mean squared error gives us the regression line. 
The regression line is the only line that minimizes the mean squared error.<\/p>\n<p style=\"text-align: justify\">That is why the regression line is sometimes called the &#8220;least squares line.&#8221;<\/p>\n<p>&nbsp;<\/p>\n<p><!--###########################################################################################################################################################--><\/p>\n<table width=\"100%\">\n<tbody>\n<tr>\n<td align=\"left\"><a class=\"next-page-link\" href=\"https:\/\/literaciadigital.ufms.br\/data8\/15-0\/15-2\/\">\u2190 Chapter 15.2 &#8211; The Regression Line<\/a><\/td>\n<td align=\"right\"><a class=\"next-page-link\" href=\"https:\/\/literaciadigital.ufms.br\/data8\/15-0\/15-4\/\">Chapter 15.4 &#8211; Least Squares Regression \u2192<\/a><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p><!--###########################################################################################################################################################--><\/p>\n<\/div>\n<\/div>\n<div style=\"clear: both;height: 1px;margin-top: -1px\"><\/div>\n","protected":false},"excerpt":{"rendered":"<p>\u00cdndice 1. O que \u00e9 Ci\u00eancia de Dados? 1.1. Introdu\u00e7\u00e3o 1.1.1. Ferramentas Computacionais 1.1.2. T\u00e9cnicas Estat\u00edsticas 1.2. Por que Ci\u00eancia de Dados? 1.3. Tra\u00e7ando os Cl\u00e1ssicos 1.3.1. Personagens Liter\u00e1rios 1.3.2. Outro Tipo de Personagem 2. Causalidade e Experimentos 2.1. John Snow e a Bomba da Broad Street 2.2. O &#8220;Grande Experimento&#8221; de Snow 2.3. 
Estabelecendo [&hellip;]<\/p>\n","protected":false},"author":21894,"featured_media":0,"parent":787,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"page-templates\/full-width.php","meta":{"footnotes":""},"coauthors":[14],"class_list":["post-829","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/literaciadigital.ufms.br\/en\/wp-json\/wp\/v2\/pages\/829","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/literaciadigital.ufms.br\/en\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/literaciadigital.ufms.br\/en\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/literaciadigital.ufms.br\/en\/wp-json\/wp\/v2\/users\/21894"}],"replies":[{"embeddable":true,"href":"https:\/\/literaciadigital.ufms.br\/en\/wp-json\/wp\/v2\/comments?post=829"}],"version-history":[{"count":5,"href":"https:\/\/literaciadigital.ufms.br\/en\/wp-json\/wp\/v2\/pages\/829\/revisions"}],"predecessor-version":[{"id":1076,"href":"https:\/\/literaciadigital.ufms.br\/en\/wp-json\/wp\/v2\/pages\/829\/revisions\/1076"}],"up":[{"embeddable":true,"href":"https:\/\/literaciadigital.ufms.br\/en\/wp-json\/wp\/v2\/pages\/787"}],"wp:attachment":[{"href":"https:\/\/literaciadigital.ufms.br\/en\/wp-json\/wp\/v2\/media?parent=829"}],"wp:term":[{"taxonomy":"author","embeddable":true,"href":"https:\/\/literaciadigital.ufms.br\/en\/wp-json\/wp\/v2\/coauthors?post=829"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}