In recent years, with the burgeoning amount of data available in digitized form, Big Data analytics has opened up new business opportunities and posed extremely challenging research problems. In step with the rest of the world, competencies in large-scale data management and analytics have been developed in India in both academia and industry. To date, there have been very few academia-industry events that bring together researchers from both backgrounds to build a larger community that can benefit enormously from the synergy. Big data analytics has many aspects, and it is rare for all of them to be highlighted in the same forum. One of the goals of this workshop is to highlight these different aspects in separate sessions.
2. Organization: The event will consist of 3 technical sessions, 2 invited talks, and a discussion session. In addition, there will be student poster sessions during breaks. Each technical session will typically have about an hour set aside for one or more talks, followed by a couple of hours of breakout sessions discussing sub-topics related to the main theme of the session. Ideally, these discussions should continue beyond the scheduled timings. The final half-day (Day 2, Session 2) will be devoted to a wrap-up discussion session, during which the session chairs will report on the discussions in the breakout sessions, and we will try to come up with some action points and directions for the Indian research community at large for the near future.
3. Technical Sessions
3.1 Information extraction from online textual data (Day 1, Session 1)
Chair: L. V. Subramanian, IBM-IRL
Featured Speaker: Bing Liu, University of Illinois, Chicago.
Summary: Much recent research on online data has been directed toward understanding content in order to provide more focused search results. This has led to numerous techniques for deriving structured data from unstructured text and to advances in many ancillary technologies, such as handling noisy textual data, semantic representations, and information extraction from semi-structured data. This session will look at the challenges in the area, recent advances, and future applications.
3.2 Analytics on linked data (Day 1, Session 2)
Chair: Indrajit Bhattacharya, IBM-IRL
Summary: Data generated by real systems usually has an underlying relational structure; similarly, facts extracted from web-scale document collections, as well as data extracted from social networks, are often represented as triples (e.g. RDF) codifying relations. In each of these cases, such linked data is best interpreted as a graph. While traditional mining and machine learning ignored these dependencies for lack of models or computational capabilities, researchers have more recently been developing algorithms and tools for handling linked (graph) data. Moreover, this setting has given rise to newer problems (such as link prediction) and newer interpretations of older problems (clustering as community detection). This session will focus on models, applications, and paradigms that are unique to linked data.
3.3 Machine learning on large data sets (Day 2, Session 1)
Chair: B. Ravindran, IIT Madras
Featured Speaker: Srinivasan Parthasarathy, Ohio State.
Summary: The recent availability of large volumes of data has necessitated the development of new algorithms and a different mindset toward learning from data. This session will look at issues ranging from distributed learning algorithms suited to map-reduce and other deployments, to advances in the theoretical analysis of algorithms where execution time and memory usage matter more than finding optimal solutions.
4. Plenary Talks
4.1 Zoubin Ghahramani, Cambridge University (Day 1, FN)
Tentative topic: Information Extraction
4.2 V. S. Subrahmanian, University of Maryland, College Park (Day 1, AN)
Tentative topic: Tracking, monitoring and forecasting behaviors of global networks.
5. Student Participation: Students will be invited to submit a 2-page extended abstract of their work, and approximately 10-15 will be shortlisted. The student participants will display posters of their work in sessions held during breaks. In addition, each student participant will make a five-minute spotlight presentation during a relevant technical session.
Day 1
8:45-9:00 Welcome and Introduction to IKDD: Gautam Shroff
9:00-9:50 Zoubin Ghahramani
9:50-10:20 Coffee Break (with student poster sessions)
10:20-1:00 Session 1: Information extraction from online textual data (Chair: L. V. Subramanian)
  10:20-11:00 Bing Liu
  11:00-11:20 Vasudeva Varma
  11:20-11:40 Lipika Dey
  11:40-1:00 Breakout Sessions
1:00-2:00 Lunch
2:00-5:10 Session 2: Analytics on linked data (Chair: Indrajit Bhattacharya)
  2:00-2:40 Soumen Chakrabarti
  2:40-3:00 Srikanta Bedathur
  3:00-3:20 Sumeet Agarwal
  3:20-3:50 Coffee Break (with student poster sessions)
  3:50-5:10 Breakout Sessions
5:10-6:00 V. S. Subrahmanian
7:00-8:00 IKDD Executive Team Meeting
Day 2
8:30-9:00 Sponsor Speak
9:00-12:10 Session 3: Machine learning on large data sets (Chair: B. Ravindran)
  9:00-9:40 Srinivasan Parthasarathy
  9:40-10:00 P. S. Sastry
  10:00-10:20 Sourangshu Bhattacharya
  10:20-10:50 Coffee Break (with student poster sessions)
  10:50-12:10 Breakout Sessions (3 out of these, to be voted on Day 1)
12:10-2:30 Working Lunch
  Summary of breakout sessions
  Discussions on future directions
2:30 Conclusion