{"id":696,"date":"2025-07-28T20:25:23","date_gmt":"2025-07-29T00:25:23","guid":{"rendered":"https:\/\/literaciadigital.ufms.br\/?page_id=696"},"modified":"2025-10-11T03:02:45","modified_gmt":"2025-10-11T07:02:45","slug":"13-2","status":"publish","type":"page","link":"https:\/\/literaciadigital.ufms.br\/en\/data8\/13-0\/13-2\/","title":{"rendered":"Chapter 13.2"},"content":{"rendered":"<div style=\"position: relative\">\n<p><!-- Main Content --><\/p>\n<div style=\"overflow: hidden\">\n<pre><code><span style=\"color: black\">from datascience import *\r\n%matplotlib inline\r\npath_data = '..\/..\/..\/assets\/data\/'\r\nimport matplotlib.pyplot as plots\r\nplots.style.use('fivethirtyeight')\r\nimport numpy as np<\/span><\/code><\/pre>\n<p>&nbsp;<\/p>\n<h1 id=\"o-bootstrap\" style=\"text-align: center\">The Bootstrap<\/h1>\n<p style=\"text-align: justify\">A data scientist is using the data in a random sample to estimate an unknown parameter. She uses the sample to calculate the value of a statistic that will serve as her estimate.<\/p>\n<p style=\"text-align: justify\">Once she has calculated the observed value of her statistic, she could simply present it as her estimate and move on. But she is a data scientist. She knows that her random sample is just one of many possible random samples, and therefore her estimate is just one of many plausible estimates.<\/p>\n<p style=\"text-align: justify\">By how much could those estimates vary? To answer this, it seems she needs to draw another sample from the population and compute a new estimate based on the new sample. But she does not have the resources to go back to the population and draw another sample.<\/p>\n<p style=\"text-align: justify\">It looks as though the data scientist is stuck.<\/p>\n<p style=\"text-align: justify\">Fortunately, a brilliant idea called <em>the bootstrap<\/em> can help her. 
Since it is not feasible to generate new samples from the population, the bootstrap generates new random samples by a method called <em>resampling<\/em>: the new samples are drawn at random <em>from the original sample<\/em>.<\/p>\n<p style=\"text-align: justify\">In this section, we will see how and why the bootstrap works. In the rest of the chapter, we will use the bootstrap for inference.<\/p>\n<h2 id=\"remunera-o-dos-funcion-rios-na-cidade-de-s-o-francisco\" style=\"text-align: justify\">Employee Compensation in the City of San Francisco<\/h2>\n<p style=\"text-align: justify\"><a href=\"https:\/\/data.sfgov.org\">SF OpenData<\/a> is a website where the City and County of San Francisco make some of their data publicly available. One of the data sets contains compensation data for employees of the city. These include medical professionals at city-run hospitals, police officers, firefighters, transportation workers, elected officials, and all other employees of the city.<\/p>\n<p style=\"text-align: justify\">Compensation data for the calendar year 2019 are in the table <code>sf2019<\/code>.<\/p>\n<pre><code><span style=\"color: black\">sf2019 = Table.read_table(path_data + 'san_francisco_2019.csv')<\/span><\/code><\/pre>\n<pre><code><span style=\"color: black\">sf2019.show(3)<\/span><\/code><\/pre>\n<table style=\"font-family: monospace;border-collapse: collapse;width: auto;margin-left: 1em\" border=\"1\">\n<thead>\n<tr style=\"background-color: #f0f0f0;border-bottom: 2px solid #ddd\">\n<th style=\"text-align: left;padding: 4px 8px\">Organization Group<\/th>\n<th style=\"text-align: left;padding: 4px 8px\">Department<\/th>\n<th style=\"text-align: left;padding: 4px 8px\">Job Family<\/th>\n<th style=\"text-align: left;padding: 4px 8px\">Job<\/th>\n<th style=\"text-align: 
left;padding: 4px 8px\">Salary<\/th>\n<th style=\"text-align: left;padding: 4px 8px\">Overtime<\/th>\n<th style=\"text-align: left;padding: 4px 8px\">Benefits<\/th>\n<th style=\"text-align: left;padding: 4px 8px\">Total Compensation<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">Public Protection<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">Adult Probation<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">Information Systems<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">IS Trainer-Journey<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">91332<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">0<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">40059<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">131391<\/td>\n<\/tr>\n<tr style=\"background-color: #f8f8f8\">\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">Public Protection<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">Adult Probation<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">Information Systems<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">IS Engineer-Assistant<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">123241<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">0<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">49279<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">172520<\/td>\n<\/tr>\n<tr>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">Public Protection<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">Adult Probation<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">Information Systems<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">IS Business Analyst-Senior<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">115715<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">0<\/td>\n<td style=\"padding: 4px 8px;border: 
1px solid #ddd\">46752<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">162468<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p style=\"text-align: justify\">There is one row for each of over 44,500 employees. The columns contain information about city departmental affiliation and the different parts of each employee's compensation package. Here is the row corresponding to London Breed, Mayor of San Francisco in 2019.<\/p>\n<pre><code><span style=\"color: black\">sf2019.where('Job', 'Mayor')<\/span><\/code><\/pre>\n<table style=\"font-family: monospace;border-collapse: collapse;width: auto;margin-left: 1em\" border=\"1\">\n<thead>\n<tr style=\"background-color: #f0f0f0;border-bottom: 2px solid #ddd\">\n<th style=\"text-align: left;padding: 4px 8px\">Organization Group<\/th>\n<th style=\"text-align: left;padding: 4px 8px\">Department<\/th>\n<th style=\"text-align: left;padding: 4px 8px\">Job Family<\/th>\n<th style=\"text-align: left;padding: 4px 8px\">Job<\/th>\n<th style=\"text-align: left;padding: 4px 8px\">Salary<\/th>\n<th style=\"text-align: left;padding: 4px 8px\">Overtime<\/th>\n<th style=\"text-align: left;padding: 4px 8px\">Benefits<\/th>\n<th style=\"text-align: left;padding: 4px 8px\">Total Compensation<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">General Administration &amp; Finance<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">Mayor<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">Administrative &amp; Mgmt (Unrep)<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">Mayor<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">342974<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">0<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">98012<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid 
#ddd\">440987<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p style=\"text-align: justify\">We are going to study the final column, <code>Total Compensation<\/code>. It is the employee's salary plus the city's contribution toward their retirement and benefit plans.<\/p>\n<p style=\"text-align: justify\">Financial packages in a calendar year can sometimes be hard to understand, as they depend on the date of hire, whether the employee is changing jobs within the city, and so on. For example, the lowest values in the <code>Total Compensation<\/code> column look a little strange.<\/p>\n<pre><code><span style=\"color: black\">sf2019.sort('Total Compensation')<\/span><\/code><\/pre>\n<table style=\"font-family: monospace;border-collapse: collapse;width: auto;margin-left: 1em\" border=\"1\">\n<thead>\n<tr style=\"background-color: #f0f0f0;border-bottom: 2px solid #ddd\">\n<th style=\"text-align: left;padding: 4px 8px\">Organization Group<\/th>\n<th style=\"text-align: left;padding: 4px 8px\">Department<\/th>\n<th style=\"text-align: left;padding: 4px 8px\">Job Family<\/th>\n<th style=\"text-align: left;padding: 4px 8px\">Job<\/th>\n<th style=\"text-align: left;padding: 4px 8px\">Salary<\/th>\n<th style=\"text-align: left;padding: 4px 8px\">Overtime<\/th>\n<th style=\"text-align: left;padding: 4px 8px\">Benefits<\/th>\n<th style=\"text-align: left;padding: 4px 8px\">Total Compensation<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">Public Protection<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">Adult Probation<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">Probation &amp; Parole<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">Deputy Probation Officer<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">0<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">0<\/td>\n<td 
style=\"padding: 4px 8px;border: 1px solid #ddd\">0<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">0<\/td>\n<\/tr>\n<tr style=\"background-color: #f8f8f8\">\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">Public Protection<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">Fire Department<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">Clerical, Secretarial &amp; Steno<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">Senior Clerk Typist<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">0<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">0<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">0<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">0<\/td>\n<\/tr>\n<tr>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">Public Protection<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">Juvenile Court<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">Correction &amp; Detention<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">Counselor, Juvenile Hall PERS<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">0<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">0<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">0<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">0<\/td>\n<\/tr>\n<tr style=\"background-color: #f8f8f8\">\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">Public Protection<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">Police<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">Clerical, Secretarial &amp; Steno<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">Clerk Typist<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">0<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">0<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">0<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">0<\/td>\n<\/tr>\n<tr>\n<td 
style=\"padding: 4px 8px;border: 1px solid #ddd\">Public Protection<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">Sheriff<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">Correction &amp; Detention<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">Deputy Sheriff<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">0<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">0<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">0<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">0<\/td>\n<\/tr>\n<tr style=\"background-color: #f8f8f8\">\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">Public Works, Transportation &amp; Commerce<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">Airport Commission<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">Sub-Professional Engineering<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">StdntDsgn Train2\/Arch\/Eng\/Plng<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">0<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">0<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">0<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">0<\/td>\n<\/tr>\n<tr>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">Public Works, Transportation &amp; Commerce<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">Airport Commission<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">Clerical, Secretarial &amp; Steno<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">Executive Secretary 1<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">0<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">0<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">0<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">0<\/td>\n<\/tr>\n<tr style=\"background-color: #f8f8f8\">\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">Public Works, Transportation &amp; 
Commerce<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">Airport Commission<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">Payroll, Billing &amp; Accounting<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">Senior Account Clerk<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">0<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">0<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">0<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">0<\/td>\n<\/tr>\n<tr>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">Public Works, Transportation &amp; Commerce<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">Airport Commission<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">Housekeeping &amp; Laundry<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">Custodian<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">0<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">0<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">0<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">0<\/td>\n<\/tr>\n<tr style=\"background-color: #f8f8f8\">\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">Public Works, Transportation &amp; Commerce<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">Airport Commission<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">Housekeeping &amp; Laundry<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">Custodian<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">0<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">0<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">0<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">0<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p style=\"text-align: justify\">For clarity of interpretation, we will focus on those who had roughly the equivalent of at least a half-time job 
for the whole year. At a minimum wage of about $15 per hour and 20 hours per week for 52 weeks, that is a salary of over $15,000 (15 &times; 20 &times; 52 = 15,600).<\/p>\n<pre><code><span style=\"color: black\">sf2019 = sf2019.where('Salary', are.above(15000))<\/span><\/code><\/pre>\n<pre><code><span style=\"color: black\">sf2019.num_rows<\/span><\/code><\/pre>\n<table style=\"font-family: monospace;border-spacing: 0;border-collapse: collapse;width: auto;margin-left: 1em\">\n<tbody>\n<tr>\n<td style=\"text-align: right;color: #888;padding-right: 0.5em\">Out[1]:<\/td>\n<td style=\"text-align: left\">37103<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<h2 id=\"popula-o-e-par-metro\" style=\"text-align: justify\">Population and Parameter<\/h2>\n<p style=\"text-align: justify\">Let this table of just over 37,000 rows be our population. Here is a histogram of the total compensation of the employees in this table.<\/p>\n<pre><code><span style=\"color: black\">sf_bins = np.arange(0, 726000, 25000)\r\nsf2019.select('Total Compensation').hist(bins=sf_bins)<\/span><\/code><\/pre>\n<p style=\"text-align: justify\"><img fetchpriority=\"high\" decoding=\"async\" class=\"alignnone size-full wp-image-697\" src=\"https:\/\/literaciadigital.ufms.br\/files\/2025\/07\/13-2-1.png\" alt=\"\" width=\"464\" height=\"323\" srcset=\"https:\/\/literaciadigital.ufms.br\/files\/2025\/07\/13-2-1.png 464w, https:\/\/literaciadigital.ufms.br\/files\/2025\/07\/13-2-1-300x209.png 300w, https:\/\/literaciadigital.ufms.br\/files\/2025\/07\/13-2-1-460x320.png 460w\" sizes=\"(max-width: 464px) 100vw, 464px\" \/><\/p>\n<p style=\"text-align: justify\">While most of the values are below $300,000, a few are considerably higher. For example, the total compensation of the Chief Investment Officer was over $700,000. 
That is why the horizontal axis stretches well to the right of the visible bars.<\/p>\n<pre><code><span style=\"color: black\">sf2019.sort('Total Compensation', descending=True).show(2)<\/span><\/code><\/pre>\n<table style=\"font-family: monospace;border-collapse: collapse;width: auto;margin-left: 1em\" border=\"1\">\n<thead>\n<tr style=\"background-color: #f0f0f0;border-bottom: 2px solid #ddd\">\n<th style=\"text-align: left;padding: 4px 8px\">Organization Group<\/th>\n<th style=\"text-align: left;padding: 4px 8px\">Department<\/th>\n<th style=\"text-align: left;padding: 4px 8px\">Job Family<\/th>\n<th style=\"text-align: left;padding: 4px 8px\">Job<\/th>\n<th style=\"text-align: left;padding: 4px 8px\">Salary<\/th>\n<th style=\"text-align: left;padding: 4px 8px\">Overtime<\/th>\n<th style=\"text-align: left;padding: 4px 8px\">Benefits<\/th>\n<th style=\"text-align: left;padding: 4px 8px\">Total Compensation<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">General Administration &amp; Finance<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">Retirement Services<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">Administrative &amp; Mgmt (Unrep)<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">Chief Investment Officer<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">577633<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">0<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">146398<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">724031<\/td>\n<\/tr>\n<tr style=\"background-color: #f8f8f8\">\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">General Administration &amp; Finance<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">Retirement Services<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">Unassigned<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">Managing Director<\/td>\n<td 
style=\"padding: 4px 8px;border: 1px solid #ddd\">483072<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">0<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">134879<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">617951<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p style=\"text-align: justify\">Suppose that the parameter we are interested in is the median of the total compensations.<\/p>\n<p style=\"text-align: justify\">Since we have the luxury of having all of the data from the population, we can simply calculate the parameter:<\/p>\n<pre><code><span style=\"color: black\">pop_median = percentile(50, sf2019.column('Total Compensation'))\r\npop_median<\/span><\/code><\/pre>\n<table style=\"font-family: monospace;border-spacing: 0;border-collapse: collapse;width: auto;margin-left: 1em\">\n<tbody>\n<tr>\n<td style=\"text-align: right;color: #888;padding-right: 0.5em\">Out[2]:<\/td>\n<td style=\"text-align: left\">135747.0<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p style=\"text-align: justify\">The median total compensation of all employees was 135,747 dollars.<\/p>\n<p style=\"text-align: justify\">From a practical perspective, there is no reason for us to draw a sample to estimate this parameter, since we simply know its value. But in this section we are going to pretend we do not know the value, and see how well we can estimate it based on a random sample.<\/p>\n<p style=\"text-align: justify\">In later sections, we will return to reality and work in situations where the parameter is unknown. 
For now, we are all-knowing.<\/p>\n<h2 id=\"uma-amostra-aleat-ria-e-uma-estimativa\" style=\"text-align: justify\">A Random Sample and an Estimate<\/h2>\n<p style=\"text-align: justify\">Let us draw a sample of 500 employees at random without replacement, and let the median total compensation of the sampled employees serve as our estimate of the parameter.<\/p>\n<pre><code>our_sample = sf2019.sample(500, with_replacement=False)\r\nour_sample.select('Total Compensation').hist(bins=sf_bins)<\/code><\/pre>\n<p style=\"text-align: justify\"><img decoding=\"async\" class=\"alignnone size-full wp-image-698\" src=\"https:\/\/literaciadigital.ufms.br\/files\/2025\/07\/13-2-2.png\" alt=\"\" width=\"464\" height=\"327\" srcset=\"https:\/\/literaciadigital.ufms.br\/files\/2025\/07\/13-2-2.png 464w, https:\/\/literaciadigital.ufms.br\/files\/2025\/07\/13-2-2-300x211.png 300w, https:\/\/literaciadigital.ufms.br\/files\/2025\/07\/13-2-2-454x320.png 454w\" sizes=\"(max-width: 464px) 100vw, 464px\" \/><\/p>\n<pre><code><span style=\"color: black\">est_median = percentile(50, our_sample.column('Total Compensation'))\r\nest_median<\/span><\/code><\/pre>\n<table style=\"font-family: monospace;border-spacing: 0;border-collapse: collapse;width: auto;margin-left: 1em\">\n<tbody>\n<tr>\n<td style=\"text-align: right;color: #888;padding-right: 0.5em\">Out[3]:<\/td>\n<td style=\"text-align: left\">136835.0<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p style=\"text-align: justify\">The sample size is large. By the law of averages, the distribution of the sample resembles that of the population. Consequently, the sample median is quite comparable to the population median, though of course it is not exactly the same.<\/p>\n<p style=\"text-align: justify\">So now we have one estimate of the parameter. 
But had the sample come out differently, the estimate would have had a different value. We would like to be able to quantify how much the estimate could vary across samples. That measure of variability will help us measure how accurately we can estimate the parameter.<\/p>\n<p style=\"text-align: justify\">To see how different the estimate would be if the sample had come out differently, we could simply draw another sample from the population. But that would be cheating. We are trying to mimic real life, in which we will not have all the population data at hand.<\/p>\n<p style=\"text-align: justify\">Somehow, we have to get another random sample <em>without sampling again from the population<\/em>.<\/p>\n<h2 id=\"o-bootstrap-reamostragem-a-partir-da-amostra\" style=\"text-align: justify\">The Bootstrap: Resampling from the Sample<\/h2>\n<p style=\"text-align: justify\">What we do have is a large random sample from the population. As we know, a large random sample is likely to resemble the population from which it is drawn. This observation allows data scientists to <em>lift themselves up by their own bootstraps<\/em>: the sampling procedure can be replicated by <em>sampling from the sample<\/em>.<\/p>\n<p style=\"text-align: justify\">Here are the steps of <em>the bootstrap method<\/em> for generating another random sample that resembles the population:<\/p>\n<ul style=\"text-align: justify\">\n<li><strong>Treat the original sample as if it were the population.<\/strong><\/li>\n<li><strong>Draw from the sample, at random with replacement, the same number of times as the original sample size<\/strong>.<\/li>\n<\/ul>\n<p style=\"text-align: justify\">It is important to resample the same number of times as the original sample size. 
The reason is that the variability of an estimate depends on the size of the sample. Since our original sample consisted of 500 employees, our sample median was based on 500 values. To see how different the sample median could have been, we have to compare it to the medians of other samples of size 500.<\/p>\n<p style=\"text-align: justify\">If we drew 500 times at random <em>without<\/em> replacement from our sample of size 500, we would simply get the same sample back. By drawing <em>with<\/em> replacement, we create the possibility for the new samples to differ from the original, because some employees might be drawn more than once and others not at all.<\/p>\n<h2 id=\"por-que-o-bootstrap-funciona\" style=\"text-align: justify\">Why the Bootstrap Works<\/h2>\n<p style=\"text-align: justify\">Why is this a good idea? By the law of averages, the distribution of the original sample is likely to resemble the population, and the distributions of all the &#8220;resamples&#8221; are likely to resemble the original sample. 
So the distributions of all the resamples are likely to resemble the population as well.<\/p>\n<pre><code><span style=\"color: black\">from IPython.display import Image\r\nImage(\"..\/..\/..\/images\/bootstrap_pic.png\")<\/span><\/code><\/pre>\n<p style=\"text-align: justify\"><img decoding=\"async\" class=\"alignnone size-large wp-image-699\" src=\"https:\/\/literaciadigital.ufms.br\/files\/2025\/07\/13-2-3-1024x465.png\" alt=\"\" width=\"1024\" height=\"465\" srcset=\"https:\/\/literaciadigital.ufms.br\/files\/2025\/07\/13-2-3-1024x465.png 1024w, https:\/\/literaciadigital.ufms.br\/files\/2025\/07\/13-2-3-300x136.png 300w, https:\/\/literaciadigital.ufms.br\/files\/2025\/07\/13-2-3-768x349.png 768w, https:\/\/literaciadigital.ufms.br\/files\/2025\/07\/13-2-3-1536x698.png 1536w, https:\/\/literaciadigital.ufms.br\/files\/2025\/07\/13-2-3-2048x931.png 2048w, https:\/\/literaciadigital.ufms.br\/files\/2025\/07\/13-2-3-704x320.png 704w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/p>\n<h2 id=\"uma-mediana-reamostrada\" style=\"text-align: justify\">A Resampled Median<\/h2>\n<p style=\"text-align: justify\">Recall that the <code>sample<\/code> method draws rows from a table with replacement by default, and when it is used without specifying a sample size, the sample size defaults to the number of rows of the table. That is perfect for the bootstrap! 
Here is one new sample drawn from the original sample, and the corresponding sample median.<\/p>\n<pre><code><span style=\"color: black\">resample_1 = our_sample.sample()<\/span><\/code><\/pre>\n<pre><code><span style=\"color: black\">resample_1.select('Total Compensation').hist(bins=sf_bins)<\/span><\/code><\/pre>\n<p style=\"text-align: justify\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-700\" src=\"https:\/\/literaciadigital.ufms.br\/files\/2025\/07\/13-2-4.png\" alt=\"\" width=\"464\" height=\"323\" srcset=\"https:\/\/literaciadigital.ufms.br\/files\/2025\/07\/13-2-4.png 464w, https:\/\/literaciadigital.ufms.br\/files\/2025\/07\/13-2-4-300x209.png 300w, https:\/\/literaciadigital.ufms.br\/files\/2025\/07\/13-2-4-460x320.png 460w\" sizes=\"(max-width: 464px) 100vw, 464px\" \/><\/p>\n<pre><code><span style=\"color: black\">resampled_median_1 = percentile(50, resample_1.column('Total Compensation'))\r\nresampled_median_1<\/span><\/code><\/pre>\n<table style=\"font-family: monospace;border-spacing: 0;border-collapse: collapse;width: auto;margin-left: 1em\">\n<tbody>\n<tr>\n<td style=\"text-align: right;color: #888;padding-right: 0.5em\">Out[4]:<\/td>\n<td style=\"text-align: left\">141793.0<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p style=\"text-align: justify\">This value is an estimate of the population median.<\/p>\n<p style=\"text-align: justify\">By resampling again and again, we can get many such estimates, and hence an empirical distribution of the estimates.<\/p>\n<pre><code><span style=\"color: black\">resample_2 = our_sample.sample()\r\nresampled_median_2 = percentile(50, resample_2.column('Total Compensation'))\r\nresampled_median_2<\/span><\/code><\/pre>\n<table style=\"font-family: monospace;border-spacing: 0;border-collapse: collapse;width: auto;margin-left: 1em\">\n<tbody>\n<tr>\n<td style=\"text-align: right;color: #888;padding-right: 0.5em\">Out[4]:<\/td>\n<td 
style=\"text-align: left\">135880.0<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p style=\"text-align: justify\">Let us collect this code and define a function <code>one_bootstrap_median<\/code> that returns one bootstrapped median of total compensation, based on bootstrapping the original random sample that we called <code>our_sample<\/code>.<\/p>\n<pre><code><span style=\"color: black\">def one_bootstrap_median():\r\n    # Resample with replacement, same size as our_sample\r\n    resampled_table = our_sample.sample()\r\n    bootstrapped_median = percentile(50, resampled_table.column('Total Compensation'))\r\n    return bootstrapped_median<\/span><\/code><\/pre>\n<p style=\"text-align: justify\">Run the cell below a few times to see how the bootstrapped medians vary. Remember that each of them is an estimate of the population median.<\/p>\n<pre><code><span style=\"color: black\">one_bootstrap_median()<\/span><\/code><\/pre>\n<table style=\"font-family: monospace;border-spacing: 0;border-collapse: collapse;width: auto;margin-left: 1em\">\n<tbody>\n<tr>\n<td style=\"text-align: right;color: #888;padding-right: 0.5em\">Out[5]:<\/td>\n<td style=\"text-align: left\">132175.0<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<h2 id=\"distribui-o-emp-rica-de-bootstrap-da-mediana-da-amostra\" style=\"text-align: justify\">Bootstrap Empirical Distribution of the Sample Median<\/h2>\n<p style=\"text-align: justify\">We can now repeat the bootstrap process multiple times by running a <code>for<\/code> loop as usual. In each iteration, we will call the function <code>one_bootstrap_median<\/code> to generate one value of the bootstrapped median based on our original sample <code>our_sample<\/code>. 
Then we will append the bootstrapped median to the collection array <code>bstrap_medians<\/code>.<\/p>\n<p style=\"text-align: justify\">Since we are asking for 5,000 repetitions, the code might take a while to run. There is a lot of resampling to do!<\/p>\n<pre><code><span style=\"color: black\">num_repetitions = 5000\r\nbstrap_medians = make_array()\r\nfor i in np.arange(num_repetitions):\r\n    bstrap_medians = np.append(bstrap_medians, one_bootstrap_median())<\/span><\/code><\/pre>\n<p style=\"text-align: justify\">Here is an empirical histogram of the 5,000 bootstrapped medians. The green dot is the population parameter: it is the median of the entire population, which is what we are trying to estimate. In this example we happen to know its value, but we did not use it in the bootstrap process.<\/p>\n<pre><code><span style=\"color: black\">resampled_medians = Table().with_column('Bootstrap Sample Median', bstrap_medians)\r\nmedian_bins=np.arange(120000, 160000, 2000)\r\nresampled_medians.hist(bins = median_bins)\r\n\r\n# Plotting parameters; you can ignore this code\r\nparameter_green = '#32CD32'\r\nplots.ylim(-0.000005, 0.00014)\r\nplots.scatter(pop_median, 0, color=parameter_green, s=40, zorder=2);<\/span><\/code><\/pre>\n<p style=\"text-align: justify\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-701\" src=\"https:\/\/literaciadigital.ufms.br\/files\/2025\/07\/13-2-5.png\" alt=\"\" width=\"455\" height=\"328\" srcset=\"https:\/\/literaciadigital.ufms.br\/files\/2025\/07\/13-2-5.png 455w, https:\/\/literaciadigital.ufms.br\/files\/2025\/07\/13-2-5-300x216.png 300w, https:\/\/literaciadigital.ufms.br\/files\/2025\/07\/13-2-5-444x320.png 444w\" sizes=\"(max-width: 455px) 100vw, 455px\" \/><\/p>\n
<p style=\"text-align: justify\">It is important to remember that the green dot is fixed: it is 135,747 dollars, the population median. The empirical histogram is the result of random draws, and will be situated randomly relative to the green dot.<\/p>\n<p style=\"text-align: justify\">Remember also that the point of all these computations is to estimate the population median, which is the green dot. Our estimates are all the randomly generated sample medians whose histogram you see above. We want the set of these estimates to contain the parameter. If it does not, then the estimates are off.<\/p>\n<h2 id=\"as-estimativas-capturam-o-par-metro-\" style=\"text-align: justify\">Do the Estimates Capture the Parameter?<\/h2>\n<p style=\"text-align: justify\">How often does the empirical histogram of the resampled medians sit firmly over the green dot, and not just brush the dot with its tails or fail to cover it at all? To answer this, we must define &#8220;sit firmly&#8221;. 
Let us take that to mean &#8220;the middle 95% of the resampled medians contains the green dot&#8221;.<\/p>\n<p style=\"text-align: justify\">Here are the two ends of the &#8220;middle 95%&#8221; interval of resampled medians:<\/p>\n<pre><code><span style=\"color: black\">left = percentile(2.5, bstrap_medians)\r\nleft<\/span><\/code><\/pre>\n<table style=\"font-family: monospace;border-spacing: 0;border-collapse: collapse;width: auto;margin-left: 1em\">\n<tbody>\n<tr>\n<td style=\"text-align: right;color: #888;padding-right: 0.5em\">Out[6]:<\/td>\n<td style=\"text-align: left\">129524.0<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<pre><code><span style=\"color: black\">right = percentile(97.5, bstrap_medians)\r\nright<\/span><\/code><\/pre>\n<table style=\"font-family: monospace;border-spacing: 0;border-collapse: collapse;width: auto;margin-left: 1em\">\n<tbody>\n<tr>\n<td style=\"text-align: right;color: #888;padding-right: 0.5em\">Out[7]:<\/td>\n<td style=\"text-align: left\">143446.0<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p style=\"text-align: justify\">The population median of 135,747 dollars is between these two numbers. 
The interval and the population median are shown on the histogram below.<\/p>\n<pre><code>resampled_medians.hist(bins = median_bins)\r\n\r\n# Plotting parameters; you can ignore this code\r\nplots.ylim(-0.000005, 0.00014)\r\nplots.plot([left, right], [0, 0], color='yellow', lw=3, zorder=1)\r\nplots.scatter(pop_median, 0, color=parameter_green, s=40, zorder=2);<\/code><\/pre>\n<p style=\"text-align: justify\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-702\" src=\"https:\/\/literaciadigital.ufms.br\/files\/2025\/07\/13-2-6.png\" alt=\"\" width=\"455\" height=\"328\" srcset=\"https:\/\/literaciadigital.ufms.br\/files\/2025\/07\/13-2-6.png 455w, https:\/\/literaciadigital.ufms.br\/files\/2025\/07\/13-2-6-300x216.png 300w, https:\/\/literaciadigital.ufms.br\/files\/2025\/07\/13-2-6-444x320.png 444w\" sizes=\"(max-width: 455px) 100vw, 455px\" \/><\/p>\n<p style=\"text-align: justify\">The &#8220;middle 95%&#8221; interval of estimates captured the parameter in our example. But was that a fluke?<\/p>\n<p style=\"text-align: justify\">To see how frequently the interval contains the parameter, we have to run the entire process over and over again. Specifically, we will replicate the following process 100 times:<\/p>\n<ul style=\"text-align: justify\">\n<li>Draw an original random sample of size 500 from the population.<\/li>\n<li>Carry out 5,000 replications of the bootstrap process and generate the &#8220;middle 95%&#8221; interval of resampled medians.<\/li>\n<\/ul>\n<p style=\"text-align: justify\">We will end up with 100 intervals, and count how many of them contain the population median.<\/p>\n<p style=\"text-align: justify\"><strong>Spoiler alert:<\/strong> The statistical theory of the bootstrap says that the number should be around 95. 
It may be in the low 90s or the high 90s, but we do not expect it to stray much from 95.<\/p>\n<p style=\"text-align: justify\">We will start by writing a function <code>bootstrap_median<\/code> that takes two arguments: the name of the table that contains the original random sample, and the number of bootstrap samples to be drawn. It returns an array of bootstrapped medians, one from each bootstrap sample.<\/p>\n<pre><code><span style=\"color: black\">def bootstrap_median(original_sample, num_repetitions):\r\n    medians = make_array()\r\n    for i in np.arange(num_repetitions):\r\n        # Resample with replacement, same size as original_sample\r\n        new_bstrap_sample = original_sample.sample()\r\n        new_bstrap_median = percentile(50, new_bstrap_sample.column('Total Compensation'))\r\n        medians = np.append(medians, new_bstrap_median)\r\n    return medians<\/span><\/code><\/pre>\n<p style=\"text-align: justify\">Now we will write a <code>for<\/code> loop that calls this function 100 times and collects the &#8220;middle 95%&#8221; of the bootstrapped medians each time.<\/p>\n<p style=\"text-align: justify\">The cell below will take several minutes to run, since in each of its 100 replications it has to draw a sample of 500 at random from the table and then generate 5,000 bootstrapped samples.<\/p>\n<pre><code><span style=\"color: black\"># THE BIG SIMULATION: This one takes several minutes.\r\n\r\n# Generate 100 intervals and put the endpoints in the table intervals\r\n\r\nleft_ends = make_array()\r\nright_ends = make_array()\r\n\r\nfor i in np.arange(100):\r\n    original_sample = sf2019.sample(500, with_replacement=False)\r\n    medians = bootstrap_median(original_sample, 5000)\r\n    left_ends = np.append(left_ends, percentile(2.5, medians))\r\n    right_ends = np.append(right_ends, percentile(97.5, medians))\r\n\r\nintervals = Table().with_columns(\r\n    'Left', left_ends,\r\n    'Right', right_ends\r\n)<\/span><\/code><\/pre>\n
<p style=\"text-align: justify\">For each of the 100 replications of the entire process, we get one interval of estimates of the median.<\/p>\n<pre><code><span style=\"color: black\">intervals<\/span><\/code><\/pre>\n<table style=\"font-family: monospace;border-collapse: collapse;width: auto;margin-left: 1em\" border=\"1\">\n<thead>\n<tr style=\"background-color: #f0f0f0;border-bottom: 2px solid #ddd\">\n<th style=\"text-align: left;padding: 4px 8px\">Left<\/th>\n<th style=\"text-align: left;padding: 4px 8px\">Right<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">125093<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">139379<\/td>\n<\/tr>\n<tr style=\"background-color: #f8f8f8\">\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">129925<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">140757<\/td>\n<\/tr>\n<tr>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">133955<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">146369<\/td>\n<\/tr>\n<tr style=\"background-color: #f8f8f8\">\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">129335<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">140847<\/td>\n<\/tr>\n<tr>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">132756<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">145429<\/td>\n<\/tr>\n<tr style=\"background-color: #f8f8f8\">\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">130167<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">143200<\/td>\n<\/tr>\n<tr>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">125935<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">138491<\/td>\n<\/tr>\n<tr style=\"background-color: #f8f8f8\">\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">131092<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">142472<\/td>\n<\/tr>\n<tr>\n<td style=\"padding: 4px 8px;border: 1px solid 
#ddd\">128509<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">140462<\/td>\n<\/tr>\n<tr style=\"background-color: #f8f8f8\">\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">131270<\/td>\n<td style=\"padding: 4px 8px;border: 1px solid #ddd\">145998<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p style=\"text-align: justify\">The good intervals are those that contain the parameter we are trying to estimate. Typically the parameter is unknown, but in this section we happen to know what the parameter is.<\/p>\n<pre><code><span style=\"color: black\">pop_median<\/span><\/code><\/pre>\n<table style=\"font-family: monospace;border-spacing: 0;border-collapse: collapse;width: auto;margin-left: 1em\">\n<tbody>\n<tr>\n<td style=\"text-align: right;color: #888;padding-right: 0.5em\">Out[8]:<\/td>\n<td style=\"text-align: left\">135747.0<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p style=\"text-align: justify\">How many of the 100 intervals contain the population median? That is the number of intervals where the left end is below the population median and the right end is above it.<\/p>\n<pre><code><span style=\"color: black\">intervals.where(\r\n    'Left', are.below(pop_median)).where(\r\n    'Right', are.above(pop_median)).num_rows<\/span><\/code><\/pre>\n<table style=\"font-family: monospace;border-spacing: 0;border-collapse: collapse;width: auto;margin-left: 1em\">\n<tbody>\n<tr>\n<td style=\"text-align: right;color: #888;padding-right: 0.5em\">Out[9]:<\/td>\n<td style=\"text-align: left\">93<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p style=\"text-align: justify\">It takes many minutes to construct all the intervals, but try it again if you have the patience. 
Most likely, about 95 of the 100 intervals will be good ones: they will contain the parameter.<\/p>\n<p style=\"text-align: justify\">It is hard to show all the intervals on the horizontal axis, as they have large overlaps (after all, they are all trying to estimate the same parameter). The graph below shows each interval on the same axes by stacking them vertically. The vertical axis is simply the number of the replication from which the interval was generated.<\/p>\n<p style=\"text-align: justify\">The green line is where the parameter is. It has a fixed position, since the parameter is fixed.<\/p>\n<p style=\"text-align: justify\">Good intervals cover the parameter. Typically, there are approximately 95 of these.<\/p>\n<p style=\"text-align: justify\">If an interval does not cover the parameter, it is a dud. The duds are the ones where you can see &#8220;daylight&#8221; around the green line. There are typically very few of them, about 5 out of 100, but they do happen.<\/p>\n<p style=\"text-align: justify\">Any method based on sampling has the possibility of being off. 
The beauty of methods based on random sampling is that we can quantify how often they are likely to be off.<\/p>\n<pre><code><span style=\"color: black\">replication_number = np.ndarray.astype(np.arange(1, 101), str)\r\nintervals2 = Table(replication_number).with_rows(make_array(left_ends, right_ends))\r\n\r\nplots.figure(figsize=(8,8))\r\nfor i in np.arange(100):\r\n    ends = intervals2.column(i)\r\n    plots.plot(ends, make_array(i+1, i+1), color='gold')\r\nplots.scatter(pop_median, 0, color=parameter_green, s=40, zorder=2)\r\nplots.plot(make_array(pop_median, pop_median), make_array(0, 100), color=parameter_green, lw=2)\r\nplots.xlabel('Median (dollars)')\r\nplots.ylabel('Replication')\r\nplots.title('Population Median and Intervals of Estimates');<\/span><\/code><\/pre>\n<p style=\"text-align: justify\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-703\" src=\"https:\/\/literaciadigital.ufms.br\/files\/2025\/07\/13-2-7.png\" alt=\"\" width=\"568\" height=\"541\" srcset=\"https:\/\/literaciadigital.ufms.br\/files\/2025\/07\/13-2-7.png 568w, https:\/\/literaciadigital.ufms.br\/files\/2025\/07\/13-2-7-300x286.png 300w, https:\/\/literaciadigital.ufms.br\/files\/2025\/07\/13-2-7-336x320.png 336w\" sizes=\"(max-width: 568px) 100vw, 568px\" \/><\/p>\n
<p style=\"text-align: justify\">To summarize what the simulation shows, suppose you are estimating the population median by the following process:<\/p>\n<ul style=\"text-align: justify\">\n<li>Draw a large random sample from the population.<\/li>\n<li>Bootstrap your random sample and get an estimate from the new random sample.<\/li>\n<li>Repeat the bootstrap step above thousands of times, and get thousands of estimates.<\/li>\n<li>Pick off the &#8220;middle 95%&#8221; interval of all the estimates.<\/li>\n<\/ul>\n<p style=\"text-align: justify\">That gives you one interval of estimates. If 99 other people repeat <strong>the entire process<\/strong>, starting with a new random sample each time, then you will end up with 100 such intervals. About 95 of these 100 intervals will contain the population parameter.<\/p>\n<p style=\"text-align: justify\">In other words, this process of estimation captures the parameter about 95% of the time.<\/p>\n<p style=\"text-align: justify\">You can replace 95% by a different value, as long as it is not 100. Suppose you replace 95% by 80% and keep the sample size fixed at 500. Then your intervals of estimates will be shorter than those we simulated here, because the &#8220;middle 80%&#8221; is a smaller range than the &#8220;middle 95%&#8221;. If you keep repeating this process, only about 80% of your intervals will contain the parameter.<\/p>\n<p>&nbsp;<\/p>\n<p><!--###########################################################################################################################################################--><\/p>\n<table width=\"100%\">\n<tbody>\n<tr>\n<td align=\"left\"><a class=\"next-page-link\" href=\"https:\/\/literaciadigital.ufms.br\/data8\/13-0\/13-1\/\">\u2190 Cap\u00edtulo 13.1 &#8211; Percentis<\/a><\/td>\n<td align=\"right\"><a class=\"next-page-link\" href=\"https:\/\/literaciadigital.ufms.br\/data8\/13-0\/13-3\/\">Cap\u00edtulo 13.3 &#8211; Intervalos de Confian\u00e7a \u2192<\/a><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p><!--###########################################################################################################################################################--><\/p>\n<\/div>\n<\/div>\n<div style=\"clear: both;height: 1px;margin-top: -1px\"><\/div>\n","protected":false},"excerpt":{"rendered":"<p>\u00cdndice 1. O que \u00e9 Ci\u00eancia de Dados? 1.1. Introdu\u00e7\u00e3o 1.1.1. 
Ferramentas Computacionais 1.1.2. T\u00e9cnicas Estat\u00edsticas 1.2. Por que Ci\u00eancia de Dados? 1.3. Tra\u00e7ando os Cl\u00e1ssicos 1.3.1. Personagens Liter\u00e1rios 1.3.2. Outro Tipo de Personagem 2. Causalidade e Experimentos 2.1. John Snow e a Bomba da Broad Street 2.2. O &#8220;Grande Experimento&#8221; de Snow 2.3. Estabelecendo [&hellip;]<\/p>\n","protected":false},"author":21894,"featured_media":0,"parent":687,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"page-templates\/full-width.php","meta":{"footnotes":""},"coauthors":[14],"class_list":["post-696","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/literaciadigital.ufms.br\/en\/wp-json\/wp\/v2\/pages\/696","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/literaciadigital.ufms.br\/en\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/literaciadigital.ufms.br\/en\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/literaciadigital.ufms.br\/en\/wp-json\/wp\/v2\/users\/21894"}],"replies":[{"embeddable":true,"href":"https:\/\/literaciadigital.ufms.br\/en\/wp-json\/wp\/v2\/comments?post=696"}],"version-history":[{"count":3,"href":"https:\/\/literaciadigital.ufms.br\/en\/wp-json\/wp\/v2\/pages\/696\/revisions"}],"predecessor-version":[{"id":1046,"href":"https:\/\/literaciadigital.ufms.br\/en\/wp-json\/wp\/v2\/pages\/696\/revisions\/1046"}],"up":[{"embeddable":true,"href":"https:\/\/literaciadigital.ufms.br\/en\/wp-json\/wp\/v2\/pages\/687"}],"wp:attachment":[{"href":"https:\/\/literaciadigital.ufms.br\/en\/wp-json\/wp\/v2\/media?parent=696"}],"wp:term":[{"taxonomy":"author","embeddable":true,"href":"https:\/\/literaciadigital.ufms.br\/en\/wp-json\/wp\/v2\/coauthors?post=696"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}