Overexploitation in the Data Ecosystem? What (Abuse) Opportunities Does Data-Mining Offer in Social Media or Other Personal Data Spaces?
PD Dr. Georg Groh
Social Computing Research Group, Faculty of Informatics
Social Media
Social Media: Characteristics
● openness: admissibility, low technical barriers
● emphasis on user generated content
● emphasis on supporting user interaction / communication (especially 1:n or n:m)
● fast dynamics
● users act as prosumers
● social information processing paradigm: collectively solve problems beyond individual capabilities [Lermann 2007; in Groh, 2012] → e.g. crowdsourcing, Wikipedia
● emergent social effects: e.g.
○ 2007 Southern California wildfire [Sutton et al., 2008; in Groh, 2012]
○ Fukushima 2011 radiation level measurements [par, 2012; in Groh, 2012]
○ Arab Spring phenomenon [DeLong-Bas, 2012; in Groh, 2012]
Social Media: Characteristics (contd.)
● Users collaboratively explicate / model relations of various kinds:
○ user ←→ user: e.g. social relations
○ user ←→ item: e.g. tags, ratings
○ item ←→ item: e.g. tags, folksonomies, semantic metadata
● user ←→ user relations (and some user ←→ item relations): may be interpreted / labeled as Social Context
Social Media Characteristics: Social Context
● Social Context: models of any aspects of social interaction between users in relation to IT systems (hardware, platforms, services etc.) (and instantiations of these models)
○ explicitly provided (example: Facebook friendship) vs. obtained via sensors + instantiating models (example: Social Situation)
○ short term (example: co-activity) vs. long term (example: social network)
○ "within" the IT system itself (example: Facebook "like") vs. "outside" the IT system but related to it (e.g. used in) (example: mutual emotional attitude of persons using a tabletop-based creativity support system)
○ binary (example: friendship) vs. n-ary (example: group)
○ explicit use (example: Facebook friendships controlling access) vs. implicit use (example: interruptibility management via interaction detection)
Social Media Characteristics: Ultra-short Essence
Ultra-short Essence: Social Media → easily editable / expandable, socially accessible Web content + social context
Social Media Technologies
general enabler technologies for Social Media: technologies for building general Rich Internet Applications (RIAs) or Web applications (see e.g. [Shklar and Rosen, 2009; in Groh, 2012]):
○ basic Web protocols (e.g. HTTP(S))
○ languages for declarative representation of structure, actual content, and format of content (e.g. HTML5, XML + related (e.g. XSLT), specialized XML languages (e.g. GML))
○ Semantic Web languages (e.g. RDF(S), OWL, SPARQL), Social Semantic Web Ontologies (e.g. SIOC, FOAF)
○ client-side technologies (e.g. Flash, JavaScript, JSON, AJAX, Silverlight)
○ server-side technologies (e.g. PHP, JSP, ASP, Ruby on Rails, Spring, Databases)
○ syndication and mash-up of content (e.g. RSS, Atom)
○ Social Software (e.g. Elgg, MediaWiki)
○ …
[iNCBEAT 2013] [SemanticFocus, 2013]
Social Media Classes
● Blogs
● Microblogs
● Wikis
● Discussion Boards
● (Messaging)
● (IP Telephony)
● (Chat)
● Social Games
● (Revision Control)
● (Content Management)
● Open Innovation platforms
● Collaborative Creativity services
● (Knowledge Codification)
● Social Networking platforms
● Mobile Social Networking
● Location-Based Social Networking
● Professional Social Networking
● Corporate Social Networking
● Partner Finding platforms
● Community platforms
● Altruistic Community platforms
● Political Community platforms
● Event platforms
● News platforms
● Social Search
● Question Answering
● Information Aggregation
● (Document Management)
● Content Sharing
● File Sharing
● Video Sharing
● Photo Sharing
● Teaching Material Sharing
● Social Bookmarking
● Product Rating
● Recommender Systems
Social Computing: Coarse Definition
coarse definition:
Social Computing: interdisciplinary field (mostly informatics) investigating, modeling, and using social context (i.e. all aspects of human social interaction in / with / around IT systems) in view of increasing the utility of the respective IT systems for the users
Social Computing: Disciplines of Informatics with High Overlap
● Social Signal Processing
● Network Analysis and Social Network Analysis
● Social Network / Social Context Visualization
● Recommender Systems
● Social Media Analysis / Web Science
● Awareness Systems
● (Privacy Management)
● (Game Theory)
● (Robotics, Distributed AI (MAS), Distributed Systems)
● (AI, Machine Learning, "Big Data" Data-Mining)
● (Mobile Computing)
Social Computing / Science w.r.t. Social Context: Examples
now: Let's take a look at some examples of research and research methods in Social Computing, and at some societal issues regarding social media and Social Computing research
Example 0: Collaborative Filtering, Social Filtering
Collaborative Filtering:
$r$ = sparse users × items rating matrix; "−" marks a missing rating, "?" marks the unknown rating of user u for item i
● $\mathcal{N}_i(u)$: users that rated item i and that are similar to user u, e.g.: $\mathcal{N}_i(u) = \{\, v_i \mid \mathrm{sim}(u, v_i) > \alpha \,\}$, where e.g. $\mathrm{sim}(u, v_i) = \cos(u, v_i) \sim u \cdot v_i$
● now: predicted rating for item i of user u, see e.g. [Desrosiers & Karypis, 2011]
now: Social Filtering:
● replace the rating-similarity-based $\mathcal{N}_i(u)$ and the rating similarities $w_{uv}$ of Collaborative Filtering with friends from the social network and tie strengths
→ comparable or better results!
→ social serendipity
see e.g. [Groh & Ehmig, 2007]
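The neighborhood-based prediction described above can be sketched in a few lines of Python. This is a minimal illustration, not the method of the cited papers: it assumes cosine similarity computed over the items both users rated, a threshold `alpha`, and a similarity-weighted mean as the predicted rating. The function names and the toy ratings are made up for this example. For Social Filtering, the same `predict` structure would apply with the neighbor set replaced by friends and the weights by tie strengths.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity over the items both users rated (None = missing)."""
    common = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not common:
        return 0.0
    dot = sum(x * y for x, y in common)
    norm_a = sqrt(sum(x * x for x, _ in common))
    norm_b = sqrt(sum(y * y for _, y in common))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def predict(ratings, u, i, alpha=0.5):
    """Similarity-weighted mean of the ratings that similar users gave item i."""
    # Candidate neighbors: every other user who rated item i.
    candidates = [
        (cosine(ratings[u], ratings[v]), ratings[v][i])
        for v in range(len(ratings))
        if v != u and ratings[v][i] is not None
    ]
    # Keep only neighbors whose similarity exceeds the threshold alpha.
    neighbors = [(w, r) for w, r in candidates if w > alpha]
    if not neighbors:
        return None
    return sum(w * r for w, r in neighbors) / sum(w for w, _ in neighbors)


# Toy data (hypothetical): rows are users, columns are items.
ratings = [
    [5, 4, None, 1],  # user 0; item 2 is the unknown "?" entry
    [5, 5, 4, 1],
    [1, 1, 2, 5],
    [4, 4, 5, None],
]
prediction = predict(ratings, u=0, i=2)
```

With this toy data, users 1 and 3 pass the threshold and user 2 does not, so the prediction lies between their ratings of 4 and 5.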
Other Examples from Our Research
[Perey, 2013]
● Social Information Retrieval
● Social Interaction Geometry
● Availability Management via Audio-Based Social Context
● Topical Social Influence
● Social Context and NLP
● Social Capital Management
● Privacy in Social Networking
● Sociotechnical Systems for Healthy Living
Machine Learning / Data-Mining
Goal: Find interesting patterns in large sets of data
sensors → Data → (extract) → Patterns / Features → (train) → (probabilistic) Model → Find clusters, predict values, classify
Data: Feature-/Pattern-Extraction Example I
“I like to dance samba, bake pizza, watch tv and plant trees in the garden. I also like to bake cakes.”
Term frequencies: I 2, like 2, to 2, dance 1, samba 1, bake 2, pizza 1, watch 1, tv 1, and 1, plant 1, trees 1, in 1, the 1, garden 1, also 1, cakes 1
Often: instead of term frequency (tf) alone, use term frequency * inverse document frequency (idf); idf = log (# of docs / # of docs where t occurs)
● here: (abstract) sensor: download from Web, direct input, etc.
● feature extraction: tf-idf
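A small sketch of how such features might be computed. It assumes the common definition idf(t) = log(N / df(t)), where N is the number of documents and df(t) the number of documents containing t. The tokenizer, the function names, and the second document are assumptions made for this illustration. Lower-casing turns "I" into "i".

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def term_frequencies(text):
    """Raw term counts (tf) of a single document."""
    return Counter(tokenize(text))


def inverse_document_frequencies(docs):
    """idf(t) = log(N / df(t)) for every term t in the collection."""
    n_docs = len(docs)
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(tokenize(doc)))  # count each term once per doc
    return {t: math.log(n_docs / df) for t, df in doc_freq.items()}


docs = [
    "I like to dance samba, bake pizza, watch tv and plant trees in the "
    "garden. I also like to bake cakes.",
    "I watch tv in the garden.",  # hypothetical second document
]
tf = term_frequencies(docs[0])
idf = inverse_document_frequencies(docs)
tfidf = {t: tf[t] * idf[t] for t in tf}
```

Terms that occur in every document (here e.g. "tv") get idf 0 and drop out, while terms specific to one document (e.g. "bake") keep a positive weight.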
Data: Feature-/Pattern-Extraction Example II
[Wikipedia 2016]
● sensor: camera → images
● feature extraction → Eigenfaces
other example:
● sound → 30 ms frames → FFT, filtering → MFCCs
Training Data
cases:
● $\{x_i\}_{i=1}^{N}$ (unsupervised learning)
● $\{(x_i, y_i)\}_{i=1}^{N}$ with continuous $y_i$ (regression, supervised learning)
● $\{(x_i, y_i)\}_{i=1}^{N}$ with discrete $y_i$ (classification, supervised learning)
Training Data
cases:
● parametric models (GMMs, Random Forests, Linear Regression, SVMs, Neural Networks etc.) vs. non-parametric models (KNN, DBSCAN etc.)
● probabilistic vs. non-probabilistic
● generative vs. discriminative models
● etc.
Linear Regression
[Bishop, 2005]
General Model:
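As a minimal sketch of the simplest case, assume a straight-line model y = w0 + w1 * x fitted by ordinary least squares. The function name `fit_line`, the parameter names `w0` and `w1`, and the data are illustrative and not taken from the slides.

```python
def fit_line(xs, ys):
    """Closed-form least-squares fit of y = w0 + w1 * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: covariance of x and y divided by the variance of x.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    w1 = sxy / sxx
    # Intercept: the fitted line passes through the point of means.
    w0 = mean_y - w1 * mean_x
    return w0, w1


# Hypothetical noise-free data generated from y = 1 + 2x.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
w0, w1 = fit_line(xs, ys)
```

On noise-free data the fit recovers the generating parameters; with noisy data it returns the line that minimizes the sum of squared residuals.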
Classification with KNN
[Bishop, 2005]
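A minimal sketch of K-nearest-neighbor classification, assuming Euclidean distance and a simple majority vote among the k closest training points. The function name `knn_classify` and the toy data are made up for illustration.

```python
from collections import Counter
from math import dist


def knn_classify(train, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    nearest = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical training set: (feature vector, class label).
train = [
    ((1.0, 1.0), "a"), ((1.5, 2.0), "a"), ((2.0, 1.0), "a"),
    ((6.0, 6.0), "b"), ((7.0, 6.5), "b"), ((6.5, 7.0), "b"),
]
label_near_a = knn_classify(train, (1.2, 1.4))
label_near_b = knn_classify(train, (6.4, 6.3))
```

KNN is non-parametric: there is no training step, and the stored data itself serves as the model.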
Classification: Decision Trees
[Bishop, 2005]
Breast Cancer Decision Tree
Classification: Logistic Regression
[Bishop, 2005]
Classification: Naive Bayes
$p(x, y \mid \theta) = p(x \mid y, \theta)\, p(y \mid \theta) = \left[ \prod_{j=1}^{D} p(x_j \mid y, \theta) \right] p(y \mid \theta)$
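The factorization can be turned into a classifier by estimating the class prior and the per-feature conditionals from counts and picking the class with the largest joint probability. The sketch below assumes categorical features and add-one (Laplace) smoothing. Both choices, as well as all names and the toy data, are assumptions made for illustration.

```python
from collections import Counter, defaultdict
from math import log


def train_naive_bayes(samples):
    """samples: list of (feature_tuple, label); returns the count tables."""
    class_counts = Counter(label for _, label in samples)
    feature_counts = defaultdict(Counter)  # (label, j) -> counts of values
    values = defaultdict(set)              # j -> set of observed values
    for x, label in samples:
        for j, value in enumerate(x):
            feature_counts[(label, j)][value] += 1
            values[j].add(value)
    return class_counts, feature_counts, values


def log_joint(x, label, model):
    """log p(x, y) = log p(y) + sum_j log p(x_j | y), add-one smoothed."""
    class_counts, feature_counts, values = model
    total = sum(class_counts.values())
    result = log(class_counts[label] / total)
    for j, value in enumerate(x):
        count = feature_counts[(label, j)][value] + 1
        result += log(count / (class_counts[label] + len(values[j])))
    return result


def classify(x, model):
    """Pick the label that maximizes the joint probability p(x, y)."""
    return max(model[0], key=lambda label: log_joint(x, label, model))


# Hypothetical toy data: (weather, temperature) -> label.
samples = [
    (("sunny", "hot"), "no"), (("sunny", "mild"), "no"),
    (("rainy", "mild"), "yes"), (("rainy", "cool"), "yes"),
    (("sunny", "cool"), "yes"),
]
model = train_naive_bayes(samples)
predicted = classify(("rainy", "cool"), model)
```

Working in log space avoids numerical underflow when the product runs over many features.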
Clustering: K-Means
[Bishop, 2005]
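A minimal sketch of K-Means, alternating an assignment step and a mean-update step for a fixed number of iterations. As a simplification it assumes fixed, hand-picked initial centers instead of random initialization. The function name `kmeans` and the points are illustrative.

```python
from math import dist


def kmeans(points, centers, iterations=10):
    """Alternate between assigning points and recomputing cluster means."""
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda k: dist(p, centers[k]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster
        # (an empty cluster keeps its previous center).
        centers = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster)) if cluster else center
            for cluster, center in zip(clusters, centers)
        ]
    return centers, clusters


# Hypothetical 2-D points forming two well-separated groups.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
centers, clusters = kmeans(points, centers=[(0.0, 0.0), (10.0, 10.0)])
```

In practice the result depends on the initialization, so several random restarts are commonly used.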
Clustering: GMMs
[Bishop, 2005]
𝑝(𝑥|𝜃)
(Deep) Neural Networks: Supervised Learning
[Bishop, 2005]
(Deep) Neural Networks: Unsupervised
[ML1, 2016]
Auto-Encoder Network
Example: Tripartite Graphs and Predicting Personality
Facebook User → owns → Facebook Profile
Facebook User → likes → Facebook Item
[Kosinski et al. 2013]
Example: Tripartite Graphs and Predicting Personality
[Kosinski et al. 2013], pointed to via [Golbeck, 2013]
from Likes ALONE!!!
Example: Privacy in Social Networking
[CBC, 2013] [Golbeck 2013]
----------------------------------------
Bibliography -- Main --
(1) Jennifer Golbeck: Two Sides of Profiling, keynote talk at SCA 2013, Karlsruhe, Germany, 2013
(2) Georg Groh: "Contextual Social Networking", Habilitation thesis, TUM Informatics
(3) Kevin Murphy: Machine Learning: a Probabilistic Perspective, MIT Press
(4) Patrick van der Smagt, Georg Groh et al.: Material of Lecture Machine Learning I, TUM, 2013-2015
Bibliography -- Further Citations --
[Wikipedia 2013] Wikipedia article on "Social Media" http://en.wikipedia.org/wiki/Social_media (checked May 2013)
[O'Reilly 2005] T. O'Reilly "What is Web2.0" (2005) http://www.oreillynet.com/pub/a/oreilly/tim/news/2005/09/30/what-is-web-20.html (checked May 2013)
[O'Reilly 2006] T. O'Reilly Web 2.0 Compact Definition: Trying Again http://radar.oreilly.com/archives/2006/12/web-20-compact-definition-tryi.html (checked May 2013)
[Lermann 2007] Kristina Lerman (2007), Social Information Processing in Social News Aggregation, Extended version of the paper in IEEE Internet Computing special issue on Social Search 11(6), pp. 16-28, 2007 http://www.isi.edu/~lerman/papers/lerman07ic.pdf (checked May 2013)
[Open Social, 2012] (Google) Open Social Initiative http://opensocial.org/ (checked May 2013)
[Peerson, 2013] Peerson P2P Social Networking Initiative http://www.peerson.net/ (checked May 2013)
[Bizer, 2009] Bizer, C., Heath, T., & Berners-Lee, T. (2009). Linked data - the story so far. International Journal on Semantic Web and Information Systems (IJSWIS), 5(3), 1-22. http://eprints.soton.ac.uk/271285/1/bizer-heath-berners-lee-ijswis-linked-data.pdf (checked May 2013)
[NN, 2013] http://winfwiki.wi-fom.de/index.php/Anwendungsm%C3%B6glichkeiten_von_Semantic_Web_in_sozialen_Netzen (checked May 2013)
[SIOC, 2013] SIOC Project Website http://sioc-project.org (checked May 2013)
[OPO, 2013] Online Presence Ontology Website http://online-presence.net (checked May 2013)
[iNCBEAT, 2013] iNCBEAT Website http://www.incbeat.com/resources/web-technologies-businesses (checked October 2013)
[SemanticFocus, 2013] Semantic Focus Website http://www.semanticfocus.com/blog/entry/title/introduction-to-the-semantic-web-vision-and-technologies-part-1-overview/ (checked October 2013)
[Perey, 2013] Perey Website http://www.perey.com/images/social_networking.jpg (checked October 2013)
[Desrosiers & Karypis 2011] Desrosiers, C., & Karypis, G. (2011). A comprehensive survey of neighborhood-based recommendation methods. In Recommender systems handbook (pp. 107-144). Springer US.
[Groh & Ehmig, 2007] Georg Groh and Christian Ehmig. 2007. Recommendations in taste related domains: collaborative filtering vs. social filtering. In Proceedings of the 2007 international ACM conference on Supporting group work (GROUP '07). ACM, New York, NY, USA, 127-136.
[URI, 2013] http://www.math.uri.edu/~merino/fall06/mth215/Adjacency.html (checked October 2013)
[Granovetter, 1973] Mark Granovetter: The Strength of Weak Ties. In: American Journal of Sociology 78 (1973), pp. 1360–1380.
[CBC, 2013] CBC News Article http://www.cbc.ca/news/canada/montreal/depressed-woman-loses-benefits-over-facebook-photos-1.861843 (checked October 2013)
[Groh 2012] Georg Groh: Contextual Social Networking, Habilitation thesis, TUM Informatics, 2012
[Golbeck 2013] Jennifer Golbeck: Two Sides of Profiling, keynote talk at SCA 2013, Karlsruhe, Germany, 2013
[Kendon, 1990] Adam Kendon: Conducting Interaction: Patterns of Behavior in Focused Encounters, CUP Archive, 1990
[Kosinski et al. 2013] Kosinski, M., Stillwell, D., & Graepel, T. (2013). Private traits and attributes are predictable from digital records of human behavior. Proceedings of the National Academy of Sciences, 110(15), 5802-5805.
[Wikipedia, 2016] https://de.wikipedia.org/wiki/Datei:Eigenfaces.png (URL, 2016)
[ML1, 2016] Machine Learning 1 Lecture TUM, 2015 / 2016
[Bishop, 2005] C. Bishop: Pattern Recognition and Machine Learning, Springer 2005
[Murphy, 2013] K. Murphy: Machine Learning: a Probabilistic Perspective, MIT Press, 2013