Bullshit isn’t what it used to be. Now, two science professors give us the tools to dismantle misinformation and think clearly in a world of fake news and bad data.

“A modern classic . . . a straight-talking survival guide to the mean streets of a dying democracy and a global pandemic.”—Wired

Misinformation, disinformation, and fake news abound, and it’s increasingly difficult to know what’s true. Our media environment has become hyperpartisan. Science is conducted by press release. Startup culture elevates bullshit to high art. We are fairly well equipped to spot the sort of old-school bullshit that is based in fancy rhetoric and weasel words, but most of us don’t feel qualified to challenge the avalanche of new-school bullshit presented in the language of math, science, or statistics.

In Calling Bullshit, Professors Carl Bergstrom and Jevin West give us a set of powerful tools to cut through the most intimidating data. You don’t need a lot of technical expertise to call out problems with data. Are the numbers or results too good or too dramatic to be true? Is the claim comparing like with like? Is it confirming your personal bias? Drawing on a deep well of expertise in statistics and computational biology, Bergstrom and West exuberantly unpack examples of selection bias and muddled data visualization, distinguish between correlation and causation, and examine the susceptibility of science to modern bullshit. We have always needed people who call bullshit when necessary, whether within a circle of friends, a community of scholars, or the citizenry of a nation. Now that bullshit has evolved, we need to relearn the art of skepticism.
Two science professors give us the tools to dismantle misinformation and think clearly in a world of fake news and bad data. Bad information abounds, and it is increasingly difficult to know what is true. Politicians are not constrained by facts, and our media environment has become hyperpartisan. Science is conducted by press release, startup culture has elevated bullshit to high art, and most administrative activity, public or private, seems to be little more than a sophisticated exercise in the combinatorial reassembly of nonsense. We are fairly well equipped to spot the kind of old-school bullshit based on fancy rhetoric and weasel words, but most of us do not feel prepared to challenge the avalanche of new-school bullshit presented in the language of mathematics, science, or statistics. Drawing on deep expertise in statistics and computational biology, Bergstrom and West unpack abundant examples of selection bias and muddled data visualization, distinguish between correlation and causation, and examine the susceptibility of science to modern bullshit. Now that bullshit has evolved, we need to relearn the art of skepticism.
☆ Endorsed by three Nobel laureates
☆ Positive reviews from 800 Amazon readers

In this age of heavy quantification, numbers are even better at lying, and at lying bigger! Statistics, charts, and explainer summaries are often bullshit dressed up as reason; likes, shares, and algorithms help truthy claims go viral and do real harm. Master the underlying logic of data and see through the packaging tricks of scientific quantification: critical thinking about information is your strongest form of self-protection!

★ Adjusted for exchange rates, our firm’s best-performing global fund beat the market in seven of the past nine years.
★ Although it did not reach statistical significance, this study’s results highlight a clinically important effect size for this targeted proton therapy, challenging the current treatment paradigm.

You have probably seen claims like these. Tempted to act on them? Don’t get excited too soon. Did you notice that the statements above never say how the returns were actually adjusted? How many of the firm’s funds underperformed the market, and by how much? Was it the same fund that beat the market in seven of those nine years? And what does a “clinically important” result that fails to reach statistical significance actually mean?

Mathematics, statistics, and science stand for reason, objectivity, and precision, but in the information age they are also the tricks that manipulate people most easily, and they are harder to see through. A picture or photo does not guarantee the truth; clean, simple tables of numbers are all the better at hiding mischief; and big data, with its many pitfalls, makes it easier to tell even bigger lies. How do you detect scientific bullshit? How do you spot where data fail to add up? In this age of deepfakes, these are vital skills of self-defense.

The two authors teach a course of the same name at the University of Washington that has drawn enormous discussion and response. Drawing on their expertise and experience in statistics and biology, they use a lively, humorous approach to take apart cases of sampling bias and of data presented in ways that muddy the waters, and to examine how easily our lives are swayed by all kinds of data illusions. With the ways of thinking in this book, anyone can notice when data are suspect and expose the illusion:

◎ Charts can mislead: Proper charts look dull and complicated. What if the vertical axis is deliberately flipped upside down, or the bars of a bar chart don’t start at zero? The public can be misled without knowing it. Media outlets just want clicks, so being interesting or attention-grabbing matters more than being correct.

◎ Numbers can lie: Counts can be wrong, small samples cannot accurately reflect the whole, and estimation procedures and methods can be flawed, lending credibility to otherwise weak claims and becoming vehicles for spreading rumors.

◎ Data can distort the facts: A news story claims a tech company’s market value evaporated by hundreds of billions after its earnings report, complete with a chart of the past four days’ share price; stretched out over five years, the company’s performance doesn’t look bad at all. Rumor had it that a television network’s stock plunged because it canceled a highly rated show, but the price fell before the cancellation, and the show accounted for 0.1% of total revenue. Could that really drive the stock down 2.5%?

◎ Machines can get it wrong: Can a computer tell wolves from huskies? The algorithm doesn’t attend to the animals’ facial features; it judges from the fact that wolves tend to appear together with snowy backgrounds. But what about a husky in the snow? Artificial intelligence can make biased judgments.

Praise and endorsements:
Jenny, host of 「JC財經觀點」
呂昱達, 丹尼老師的公民教室 (Teacher Danny’s Civics Classroom)
吳媛媛, writer on Swedish society
黃哲斌, columnist for《天下雜誌》
雷浩斯, value investor and financial writer
詹益鑑, founder of Taiwan Global Angels
羅世宏, professor of communication, National Chung Cheng University
George Akerlof, 2001 Nobel laureate in economics
Saul Perlmutter, 2011 Nobel laureate in physics
Paul Romer, 2018 Nobel laureate in economics
(Endorsers listed by stroke count of surname)

“If you want to read a book that is bound to become a classic, buy this one! It tackles the most important issue of our time: truth is no longer respected. It is also a literary masterpiece, with fresh delights on every page, indeed in every paragraph.” —George Akerlof, 2001 Nobel laureate in economics

“Reading the authors’ examples of bullshit made me laugh and cry. If you care about how arithmetic connects to science and want to know how we get fooled, you will find this book impossible to put down. It is essential reading for our times.” —Saul Perlmutter, 2011 Nobel laureate in physics and professor of physics at UC Berkeley

“Right now we are surrounded by deceptions that people feel powerless to recognize, and everyone is struggling to find a way through. This book teaches us how to see bullshit, refuse bullshit, and keep bullshit from winning.” —Paul Romer, 2018 Nobel laureate in economics

“In a day flooded with false and misleading information, strengthening our defenses is urgent. Within a clear framework and with accessible examples, the two authors walk us step by step through how to spot the bullshit elements in data and information, laying a foundation for resisting data illusions.” —吳媛媛, writer on Swedish society

“I teach media literacy to high school students, presenting theory and laying out cases in class, but I often fight alone within the social studies curriculum and doubt whether my efforts can really cultivate critical thinking in the next generation. This book is a beacon for frontline educators: debunking false messages one by one is always too slow, and cultivating literacy so that rumors stop with the wise is the only real solution. The authors’ humorous touch also lets readers laugh out loud while taking bullshit apart. I sincerely recommend it.” —呂昱達, founder of 丹尼老師的公民教室

“After reading this book I felt I had benefited enormously, and I am eager to recommend it to the teachers and friends around me. Armed with its ideas and techniques, I believe anyone can calmly face this era of pervasive nonsense and ubiquitous false information, and live as a free person who cannot be fooled.” —羅世宏, professor of communication, National Chung Cheng University

“In our era, bullshit is also deployed in fake news, disinformation, and cyberwarfare. The messages that pour into your chat apps at all hours drown your attention, and they all demonstrate one thing: this is an age in which bullshit constantly seizes your attention. So we genuinely need this survival guide for facing fake data.” —雷浩斯, value investor and financial writer

“We should learn to recognize and see through bullshit not only to avoid wasting time, but more importantly so that we can put our minds and attention on the things that accumulate over the long run and compound.” —詹益鑑, founder of Taiwan Global Angels

“A modern classic. At a moment when democracy is on its last legs and a pandemic rages around the globe, this plain-speaking book is a survival guide to the mean streets.” —Wired

“A book even a layperson can digest! I had never taken a statistics course before reading it, and when it comes to real science, my friends and I are all outsiders. Lately I keep seeing some of my friends forwarding scientific studies to justify increasingly extreme views, and I feel unable to step in as a voice of reason, because even when those studies have methodological flaws, I can’t pick them out. This book is full of important insights, is easy to read, and is laced with wonderful humor. Thank you for giving people like me the tools to push back against cargo-cult science!” —Amazon reader 阿莫爾

“A few years ago my homeschooled daughter used these University of Washington professors’ course series as her high school material, so I was delighted to learn they had turned it into a book! This year my son will use the book and the videos for his media literacy unit. It is not only an enjoyable read but a genuinely useful analysis. The authors’ blend of knowledge and humor makes concepts like charts and sampling bias truly fun.” —Amazon reader 佩遜斯

“University students who get into this course are very lucky. It would be even better if the book were adapted into a ‘defense against the dark arts’ class for high schoolers.” —Amazon reader 馬希

“An excellent book for making sense of the information swamp, on a subject that affects everyone. It is simple and easy to read and understand, without stranding you in statistics, technical jargon, or professor-speak.” —Amazon reader 塔曼

“This book will appeal to every kind of reader: people simply curious about deception and truth will want to read it, and so will scientists who deal with these issues all the time. It strikes a perfect balance between the authors’ personal anecdotes and accessible explanations of technical material, and it is beautifully written!” —Amazon reader 史塔爾茲
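To make the truncated-axis point above concrete, here is a minimal matplotlib sketch; the brands and percentages are invented for illustration. The same two numbers look dramatically different depending on whether the bars start at zero.

```python
# Minimal sketch of the truncated-axis trick described above.
# The data are invented; the point is that the same numbers look dramatic
# when the y-axis starts near the smallest value instead of at zero.
import matplotlib.pyplot as plt

labels = ["Brand A", "Brand B"]
values = [96, 98]  # hypothetical survey percentages

fig, (ax_trunc, ax_zero) = plt.subplots(1, 2, figsize=(8, 3))

ax_trunc.bar(labels, values)
ax_trunc.set_ylim(95, 99)   # truncated axis: a 2-point gap looks enormous
ax_trunc.set_title("Axis starts at 95")

ax_zero.bar(labels, values)
ax_zero.set_ylim(0, 100)    # honest axis: the bars look nearly equal
ax_zero.set_title("Axis starts at 0")

plt.tight_layout()
plt.show()
```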
2011
In this paper, we show how the Eigenfactor® score, originally designed for ranking scholarly journals, can be adapted to rank the scholarly output of authors, institutions, and countries based on author-level citation data. Using the methods described herein, we provide Eigenfactor rankings for 84,808 disambiguated authors of 240,804 papers in the Social Science Research Network (SSRN), a pre- and post-print archive devoted to the rapid dissemination of scholarly research in the social sciences and humanities. As an additive metric, the Eigenfactor scores are readily computed for collectives such as departments or institutions as well. We show that a collective's Eigenfactor score can be computed either by summing the Eigenfactor scores of its members, or by working directly with a collective-level cross-citation matrix. To illustrate, we provide Eigenfactor rankings for institutions and countries in the SSRN repository. With a network-wide comparison of Eigenfactor scores and download tallies, we demonstrate that Eigenfactor scores provide information that is both different from and complementary to that provided by download counts. We see author-level ranking as one filter for navigating the scholarly literature, and note that such rankings generate incentives for more open scholarship, as authors are rewarded for making their work available to the community as early as possible and prior to formal publication.
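The additivity property described above can be restated compactly; the notation EF(·) below is mine, standing for the author-level Eigenfactor score, and this is a schematic summary rather than a derivation from the paper.

```latex
% Schematic restatement of additivity (notation EF(.) assumed here):
% a collective C (department, institution, or country) has a score equal to
% the sum of its members' author-level scores, and the same value can be
% obtained by working directly with the collective-level cross-citation matrix.
\[
  \mathrm{EF}(C) \;=\; \sum_{a \in C} \mathrm{EF}(a)
\]
```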
Open access publishing has been proposed as one possible solution to the serials crisis: the rapidly growing subscription prices in scholarly journal publishing. However, open access publishing can present economic pitfalls as well, such as excessive article processing charges. We discuss the decision that an author faces when choosing to submit to an open access journal. We develop an interactive tool to help authors compare among alternative open access venues and thereby get the most for their article processing charges.
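As a rough sketch of the kind of comparison such a tool might support, one could rank venues by an influence measure obtained per dollar of article processing charge; the journals, APC amounts, influence scores, and the specific metric below are hypothetical illustrations, not data or methods from the paper.

```python
# Hypothetical sketch: ranking open access venues by "influence per dollar".
# The journals, APCs, and influence scores are made up for illustration,
# not taken from the paper or from any real comparison tool.

venues = [
    {"name": "Journal A", "apc_usd": 1500, "influence": 2.4},
    {"name": "Journal B", "apc_usd": 3000, "influence": 3.1},
    {"name": "Journal C", "apc_usd": 800,  "influence": 0.9},
]

def value_per_dollar(venue):
    """Influence score obtained per dollar of article processing charge."""
    return venue["influence"] / venue["apc_usd"]

for v in sorted(venues, key=value_per_dollar, reverse=True):
    print(f'{v["name"]}: {value_per_dollar(v):.5f} influence per USD')
```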
2011
Funding agencies and policy makers in science tout the importance of interdisciplinary and transdisciplinary research -- bring together experts from multiple domains, the thinking goes, and good things happen. But what are these good things, and can they be measured? Scholarly networks are a good model system for answering these questions. Looking at millions of collaborations between scientists across disciplinary and generational boundaries, we are developing metrics and algorithms that detect the ontogeny of innovative ideas and technological breakthroughs that lead to new fields and sub-fields of science.
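The abstract does not specify the metrics under development; purely as a hypothetical illustration of quantifying "bringing together experts from multiple domains," one could measure the fraction of co-authorship links that cross disciplinary boundaries, as in the toy sketch below.

```python
# Hypothetical illustration: fraction of co-authorship links that cross
# disciplinary boundaries. This is not the project's metric, just a simple
# example of quantifying cross-domain collaboration in a scholarly network.

# Each author is mapped to a discipline; each edge is one co-authorship tie.
discipline = {"ana": "biology", "ben": "physics", "cam": "biology", "dee": "statistics"}
coauthorships = [("ana", "ben"), ("ana", "cam"), ("ben", "dee"), ("cam", "dee")]

cross = sum(1 for a, b in coauthorships if discipline[a] != discipline[b])
print(f"Cross-disciplinary fraction: {cross / len(coauthorships):.2f}")
```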
In January 2016, the Canadian Ministry of Innovation, Science and Economic Development (the “Ministry”) commissioned this study to gain insight into technological innovation in Canada through the lens of the United States (“U.S.”) patent system. Comprehensive patent data of the sort required to make direct measurements of the Canadian patent system is not yet available in a form that lends itself to direct analysis, in particular as bulk downloads of patent data and metadata. In lieu of this, we use two massive datasets that we have assembled: (1) a dataset containing detailed information about patent prosecution in the U.S. from 1976 to the present, and (2) a dataset containing all federal U.S. patent litigations from 2000 to 2014 that resulted in a published opinion or order. We parsed these two datasets to derive a data subset containing all U.S. patents having at least one Canadian inventor (“Canadian-Contribution” or “CanCon” patents). The detailed information in the CanCon data subset has allowed direct analysis of such variables as inventor nationality (e.g., Canadian), numbers of inventors, application and issuance years, technology areas, assignees, prosecuting United States Patent and Trademark Office (“USPTO”) examiners, prosecuting attorneys, and/or prosecuting law firms. We constructed a comprehensive patent citation graph of all post-1975 patents, and used this graph to determine patent importance within the network using Eigenfactor centrality methods.

The results of our analyses offer new insights into the inventive behavior of Canadians and of the owners of patents co-invented with Canadians. For example, there are 170,067 U.S. patents in total having at least one Canadian inventor. The mean importance of these patents is 1.133, meaning that CanCon patents are, on average, about 13% more important than the average U.S. patent, whose mean importance is normalized to 1.00. By comparison, the mean importance of all U.S. patents having at least one foreign inventor is 0.95, slightly below average. The technological fields with the largest numbers of CanCon patents tend to involve telecommunications, computers and computer software, pharmaceuticals, molecular biology, chemistry, land vehicles, and the design of furnishings. With respect to patents litigated in U.S. federal courts, CanCon patents are less likely to be the subject of a federal lawsuit, and those CanCon patents that are litigated tend to be of lower importance than the average litigated patent.
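As a rough sketch of how citation-based importance scores of this kind can be computed and rescaled so that the network-wide mean is 1.00 (so a score of 1.133 reads as 13% above the average patent), one might proceed as below; the toy graph and the use of plain PageRank in place of the study's Eigenfactor centrality pipeline are illustrative assumptions.

```python
# Illustrative sketch, not the study's actual pipeline: score patents with a
# PageRank-style centrality on the citation graph, then rescale so the mean
# score is 1.00, so that e.g. 1.133 means "13% above the average patent".
import networkx as nx

# Toy citation graph: an edge A -> B means patent A cites patent B,
# so importance accrues to the patents that are cited.
citations = [("P1", "P2"), ("P1", "P3"), ("P2", "P3"), ("P4", "P3"), ("P4", "P1")]
G = nx.DiGraph(citations)

raw = nx.pagerank(G, alpha=0.85)                 # cited patents accumulate importance
mean_raw = sum(raw.values()) / len(raw)
importance = {patent: score / mean_raw for patent, score in raw.items()}

for patent, score in sorted(importance.items(), key=lambda kv: -kv[1]):
    print(f"{patent}: {score:.3f}")
```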