
NEURAL LEARNING AI: Machine learning the Internet of Things

Note: it is a propaganda piece, but that is okay if it helps learning.

Six Introductory Terms & The Five Effects

Predictive Analytics Terminology

1. Predictive analytics

Technology that learns from experience (data) to predict the future behavior of individuals in order to drive better decisions.

In this definition, individuals is a broad term that can refer to people as well as other organizational elements. Most examples in this book involve predicting people, such as customers, debtors, applicants, employees, students, patients, donors, voters, taxpayers, potential suspects, and convicts.

Predictive analytics also applies to individual companies (e.g., for business-to-business), products, locations, restaurants, vehicles, ships, flights, deliveries, buildings, manholes, transactions, Facebook posts, movies, satellites, stocks, Jeopardy! questions, and much more.

Whatever the domain, predictive analytics (PA) renders predictions over scalable numbers of individuals.

2. Predictive Model

A mechanism that predicts a behavior of an individual, such as click, buy, lie, or die.

It takes characteristics (variables) of the individual as input and provides a predictive score as output. The higher the score, the more likely it is that the individual will exhibit the predicted behavior. (Basically, a mathematical equation)
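As a concrete illustration, here is a minimal Python sketch of such a scoring mechanism, assuming a simple logistic-regression form; the variables and weights are invented for illustration and are not taken from the book.

```python
import math

# A predictive model is, at its core, a function from an individual's
# characteristics (variables) to a score. This toy model uses a
# logistic-regression form; the weights and variables are invented.
WEIGHTS = {"visits_last_month": 0.8, "avg_purchase": 0.03, "is_new_customer": -0.5}
BIAS = -2.0

def predictive_score(individual: dict) -> float:
    """Return a score between 0 and 1: the higher the score, the more
    likely the individual will exhibit the predicted behavior (e.g., buy)."""
    z = BIAS + sum(w * individual.get(k, 0.0) for k, w in WEIGHTS.items())
    return 1.0 / (1.0 + math.exp(-z))

print(predictive_score({"visits_last_month": 4, "avg_purchase": 25.0, "is_new_customer": 0}))
# -> about 0.88: this individual is quite likely to exhibit the behavior
```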

3. Artificial Intelligence

Advanced machine capabilities that are by definition impossible to achieve since, once achieved, they have necessarily been trivialized (by way of being mechanized) and are therefore not impressive in the subjective sense of intelligence, so they no longer qualify. (Qualify for what?)

The word intelligence has no formal definition, so why use it in an engineering context? (Is this the opinion of domain experts on what can go wrong?)

I still feel that IBM Watson is truly intelligent when I watch it play the TV quiz show Jeopardy!

This definition is not an excerpt from the book Predictive Analytics, but it does summarize one of my conclusions in the book chapter on Watson.

4. Uplift Model

A type of predictive model that predicts the influence on an individual’s behavior that results from applying one treatment over another. Synonyms include: differential response, impact, incremental impact, incremental lift, incremental response, net lift, net response, persuasion, true lift, or true response model.

The uplift score output by an uplift model answers the question: “How much more likely is this treatment to generate the desired outcome than the alternative treatment?”
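One common construction, though not necessarily the one the book uses, is the “two-model” approach: fit separate response models on the treated and the control group, then subtract their predicted probabilities. A hedged sketch in Python with scikit-learn, where toy random data stands in for a real campaign:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Two-model uplift: one response model per treatment arm (toy data).
rng = np.random.default_rng(0)
X_treat, y_treat = rng.normal(size=(500, 3)), rng.integers(0, 2, 500)  # treated group
X_ctrl, y_ctrl = rng.normal(size=(500, 3)), rng.integers(0, 2, 500)    # control group

model_treat = LogisticRegression().fit(X_treat, y_treat)
model_ctrl = LogisticRegression().fit(X_ctrl, y_ctrl)

def uplift_score(x):
    """P(outcome | treatment) - P(outcome | control) for one individual:
    how much more likely the treatment is to generate the desired outcome."""
    x = np.asarray(x).reshape(1, -1)
    return model_treat.predict_proba(x)[0, 1] - model_ctrl.predict_proba(x)[0, 1]

print(uplift_score([0.2, -1.0, 0.5]))  # positive => treatment helps this individual
```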

For more information, see the article Personalization Is Back: How to Drive Influence by Crunching Numbers (which includes links for further reading at the end), Chapter 7 of Predictive Analytics, and, for more technical citations, the Notes corresponding to that chapter, which may be downloaded as a PDF.

5. Vast Search

The term that industry leader (and Chapter 1 predictive investor) John Elder coined for predictive modeling’s intrinsic automation of testing many predictor variables, and the associated peril of stumbling across a correlation with the target variable that may be perceived as significant if considered in isolation, without accounting for the search that was employed to unearth it, when in fact it is only due to random perturbations.

Synonyms include: multiple comparisons trap, multiple hypothesis testing, researcher degrees of freedom, over-search (akin to over-fit), look-elsewhere effect, the garden of forking paths, fishing expedition, cherry-picking findings, data dredging, significance chasing, and p-hacking. (Domain of statistical analysis methods and tools)
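The peril is easy to reproduce. The following simulation is my own illustration, not from the book: it correlates a random target with 1,000 pure-noise predictors, and roughly one percent of them clear p < 0.01 by luck alone, which is exactly the trap of reporting such a hit without accounting for the search.

```python
import numpy as np
from scipy.stats import pearsonr

# Vast search in miniature: test 1,000 noise "predictors" against a
# random target; several will look significant purely by chance.
rng = np.random.default_rng(42)
target = rng.normal(size=200)

false_hits = 0
for _ in range(1000):
    predictor = rng.normal(size=200)  # pure noise, no real relationship
    _, p = pearsonr(predictor, target)
    if p < 0.01:
        false_hits += 1

print(f"{false_hits} of 1000 noise variables passed p < 0.01")  # expect about 10
```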

For more information, see my article HBO Teaches You How to Avoid Bad Science, Chapter 3 of the 2016 updated edition of my book, Predictive Analytics, and, for more technical citations, the Notes corresponding to that chapter, which may be downloaded as a PDF.

6. Automatic Suspect Discovery (ASD)

In law enforcement, the identification of previously unknown potential suspects by applying predictive analytics to flag and rank individuals according to their likelihood to be worthy of investigation, either because of their direct involvement in, or relationship to, criminal activities. (Biases related to racist behaviors, elite classes world view, and mental training?)

Further info: This topic is explored in a special sidebar on the NSA use of predictive analytics within the ethics- and privacy-focused Chapter 2 of Predictive Analytics. Also see my Newsweek op-ed on this topic.

ASD provides a novel means to unearth new suspects.

Using it, law enforcement can hunt scientifically, more effectively targeting its search by applying predictive analytics, the same state-of-the-art, data-driven technology behind fraud detection, financial credit scoring, spam filtering, and targeted marketing. (Application to target specific classes of people, and saying that science is supporting the actions)

ASD flags new persons of interest who may then be elevated to suspect by an ensuing investigation.
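Mechanically, the flag-and-rank step is just ordering individuals by model score and surfacing the top of the list for human review. A purely illustrative sketch, with hypothetical records and scoring, which also makes plain where biased training data would propagate:

```python
def rank_persons_of_interest(records, score, top_k=10):
    """Rank individuals by predicted investigation-worthiness and return
    only the top_k for human review; the model flags, but only a trained
    agent can elevate someone to suspect. Any bias in the scoring model's
    training data is inherited directly by this ranking."""
    return sorted(records, key=score, reverse=True)[:top_k]

# Hypothetical usage with made-up records and a made-up risk score.
people = [{"id": 1, "risk": 0.12}, {"id": 2, "risk": 0.87}, {"id": 3, "risk": 0.44}]
print(rank_persons_of_interest(people, score=lambda p: p["risk"], top_k=2))
```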

By the formal law enforcement definition of the word, an individual would not be classified as a suspect by a computer, but only by a trained enforcement agent.


Soon, your Privacy is Privatized: To be purchased

In the 1960s, mainframe computers posed a significant technological challenge to common notions of privacy. That’s when the federal government started putting tax returns into those giant machines, and consumer credit bureaus began building databases containing the personal financial information of millions of Americans.

Many people feared that the new computerized databanks would be put in the service of an intrusive corporate or government Big Brother.

Published in the New York Times on March 23, 2013 under “Big Data Is Opening Doors, but Maybe Too Many”

“It really freaked people out,” says Daniel J. Weitzner, a former senior Internet policy official in the Obama administration. “The people who cared about privacy were every bit as worried as we are now.”

Along with fueling privacy concerns, the mainframes helped prompt the growth and innovation that we have come to associate with the computer age.

Today, many experts predict that the next wave will be driven by technologies that fly under the banner of Big Data — data including Web pages, browsing habits, sensor signals, smartphone location trails and genomic information, combined with clever software to make sense of it all.

Proponents of this new technology say it is allowing us to see and measure things as never before — much as the microscope allowed scientists to examine the mysteries of life at the cellular level.

Big Data, they say, will open the door to making smarter decisions in every field from business and biology to public health and energy conservation.

“This data is a new asset,” says Alex Pentland, a computational social scientist and director of the Human Dynamics Lab at M.I.T. “You want it to be liquid and to be used.”

But the latest leaps in data collection are raising new concern about infringements on privacy — an issue so crucial that it could trump all others and upset the Big Data bandwagon.

Dr. Pentland is a champion of the Big Data vision and believes the future will be a data-driven society. Yet the surveillance possibilities of the technology, he acknowledges, could leave George Orwell in the dust.

The World Economic Forum published a report late last month that offered one path — one that leans heavily on technology to protect privacy.

The report grew out of a series of workshops on privacy held over the last year, sponsored by the forum and attended by government officials and privacy advocates, as well as business executives.

The corporate members, more than others, shaped the final document.

The report, “Unlocking the Value of Personal Data: From Collection to Usage,” recommends a major shift in the focus of regulation toward restricting the use of data.

Curbs on the use of personal data, combined with new technological options, can give individuals control of their own information, according to the report, while permitting important data assets to flow relatively freely.

“There’s no bad data, only bad uses of data,” says Craig Mundie, a senior adviser at Microsoft, who worked on the position paper. (Even when the data is false and erroneous?)

The report contains echoes of earlier times. The Fair Credit Reporting Act, passed in 1970, was the main response to the mainframe privacy challenge. The law permitted the collection of personal financial information by the credit bureaus, but restricted its use mainly to three areas: credit, insurance and employment.

The forum report suggests a future in which all collected data would be tagged with software code that included an individual’s preferences for how his or her data is used.

All uses of data would have to be registered, and there would be penalties for violators.

For example, one violation might be a smartphone application that stores more data than is necessary for a registered service like a smartphone game or a restaurant finder.
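In code, the report’s tagging idea might look like the following sketch. The report specifies no concrete mechanism, so every name here is an assumption: each record carries its owner’s permitted uses, every attempted use is registered in a log, and uses outside the tag are refused.

```python
from dataclasses import dataclass, field

@dataclass
class TaggedData:
    """A data record tagged with its owner's usage preferences."""
    owner: str
    payload: dict
    allowed_uses: set = field(default_factory=set)  # e.g., {"restaurant_finder"}

AUDIT_LOG = []  # every use must be registered

def use_data(record: TaggedData, purpose: str) -> dict:
    AUDIT_LOG.append((record.owner, purpose))  # register the use, allowed or not
    if purpose not in record.allowed_uses:
        raise PermissionError(f"use '{purpose}' not permitted by {record.owner}")
    return record.payload

record = TaggedData("alice", {"location": "41.9,12.5"}, {"restaurant_finder"})
use_data(record, "restaurant_finder")  # allowed
# use_data(record, "ad_targeting")     # would raise PermissionError, and be logged
```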

The corporate members of the forum say they recognize the need to address privacy concerns if useful data is going to keep flowing.

George C. Halvorson, chief executive of Kaiser Permanente, the large health care provider, extols the benefits of its growing database on 9 million patients, tracking treatments and outcomes to improve care, especially in managing costly chronic and debilitating conditions like heart disease, diabetes and depression.

New smartphone applications, he says, promise further gains — for example, a person with a history of depression whose movement patterns slowed sharply would get a check-in call.

“We’re on the cusp of a golden age of medical science and care delivery,” Mr. Halvorson says. “But a privacy backlash could cripple progress.”

Corporate executives and privacy experts agree that the best way forward combines new rules and technology tools. But some privacy professionals say the approach in the recent forum report puts way too much faith in the tools and too little emphasis on strong rules, particularly in moving away from curbs on data collection.

“We do need use restrictions, but there is a real problem with getting rid of data collection restrictions,” says David C. Vladeck, a professor of law at Georgetown University. “And that’s where they are headed.”

“I don’t buy the argument that all data is innocuous until it’s used improperly,” adds Mr. Vladeck, former director of the Bureau of Consumer Protection at the Federal Trade Commission.

Vladeck offers this example: Imagine spending a few hours looking online for information on deep fat fryers. You could be looking for a gift for a friend or researching a report for cooking school. But to a data miner, tracking your click stream, this hunt could be read as a telltale signal of an unhealthy habit — a data-based prediction that could make its way to a health insurer or potential employer.

Dr. Pentland, an academic adviser to the World Economic Forum’s initiatives on Big Data and personal data, agrees that limitations on data collection still make sense, as long as they are flexible and not a “sledgehammer that risks damaging the public good.”

He is leading a group at the M.I.T. Media Lab that is at the forefront of a number of personal data and privacy programs and real-world experiments. He espouses what he calls “a new deal on data” with three basic tenets: you have the right to possess your data, to control how it is used, and to destroy or distribute it as you see fit.

Personal data, Dr. Pentland says, is like modern money — digital packets that move around the planet, traveling rapidly but needing to be controlled. “You give it to a bank, but there’s only so many things the bank can do with it,” he says.

His M.I.T. group is developing tools for controlling, storing and auditing flows of personal data. One of these is an open-source personal data store called openPDS.

In theory, this kind of technology would undermine the role of data brokers and, perhaps, mitigate privacy risks. In the search for a deep fat fryer, for example, an audit trail should detect unauthorized use.
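An audit of that sort reduces to a simple scan: compare logged accesses against granted permissions and flag whatever no grant covers. A minimal sketch assuming a tuple-based log format; this is not openPDS’s actual API.

```python
def find_unauthorized_uses(audit_trail, grants):
    """audit_trail: list of (accessor, purpose) tuples;
    grants: dict mapping each accessor to its set of permitted purposes.
    Returns the accesses that no grant covers."""
    return [(who, purpose) for who, purpose in audit_trail
            if purpose not in grants.get(who, set())]

trail = [("ad_network", "search_history"), ("doctor", "heart_rate")]
grants = {"doctor": {"heart_rate"}}
print(find_unauthorized_uses(trail, grants))  # flags the ad_network access
```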

Dr. Pentland’s group is also collaborating with law experts, like Scott L. David of the University of Washington, to develop innovative contract rules for handling and exchanging data that ensure privacy and security and minimize risk.

The M.I.T. team is also working on living lab projects. One that began recently is in the region around Trento, Italy, in cooperation with Telecom Italia and Telefónica, the Spanish mobile carrier.

About 100 families with young children are participating. The goal is to study how much and what kind of information they share on smartphones with one another, and with social and medical services — and their privacy concerns.

“Like anything new,” Dr. Pentland says, “people make up just-so stories about Big Data, privacy and data sharing,” often based on their existing beliefs and personal bias. “We’re trying to test and learn,” he says.

A version of this article appeared in print on March 24, 2013, on page BU3 of the New York edition with the headline: Big Data Is Opening Doors, But Maybe Too Many.
