Our lives rely on a wide variety
of software, particularly web services such as Google, Facebook,
Twitter, and Amazon. More and more data is published and shared
through the internet. These services gather, store, manipulate, and
serve huge amounts of information. This Information
Infrastructure course is one of the first steps in learning how
to gather, store, manipulate, and serve information using the Python programming language.
Specific objectives
- Gather: learn techniques to obtain data from the
web
- Store: learn basic database concepts and how to use
databases
- Manipulate: learn how to manipulate data stored in files
and databases
- Serve: learn the basic concepts of web protocols and how
to use web servers to serve data over the web
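As a rough illustration of how the four objectives fit together, here is a minimal sketch written for Python 3 using only the standard library. Everything in it is an assumption made for illustration and is not course material: the sample data, the table name `posts`, and the function names are hypothetical, and a hard-coded JSON string stands in for data that would normally be fetched from the web.

```python
import json
import sqlite3

# Hypothetical sample standing in for data gathered from the web.
RAW = ('[{"author": "alice", "text": "hello"},'
       ' {"author": "bob", "text": "hi"},'
       ' {"author": "alice", "text": "bye"}]')


def gather():
    """Gather: parse the raw JSON into Python objects."""
    return json.loads(RAW)


def store(conn, posts):
    """Store: save the records in a database table."""
    conn.execute("CREATE TABLE IF NOT EXISTS posts (author TEXT, text TEXT)")
    conn.executemany(
        "INSERT INTO posts (author, text) VALUES (?, ?)",
        [(p["author"], p["text"]) for p in posts],
    )
    conn.commit()


def manipulate(conn):
    """Manipulate: count the posts written by each author."""
    rows = conn.execute(
        "SELECT author, COUNT(*) FROM posts GROUP BY author ORDER BY author"
    ).fetchall()
    return dict(rows)


def serve(counts):
    """Serve: encode the result as JSON, as a web server might return it."""
    return json.dumps(counts, sort_keys=True)


conn = sqlite3.connect(":memory:")  # in-memory database, nothing is written to disk
store(conn, gather())
body = serve(manipulate(conn))
print(body)
```

In a real system the gather step would download data over HTTP and the serve step would run behind a web server; both are left out here so the sketch stays self-contained.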
- Syllabus
- Print this page. Note that the contents may be updated throughout
the course. Important updates will be announced.
- Time & Location
- Ballantine
Hall (BH) 304
Monday & Wednesday 4pm-5:15pm
First meeting:
Aug. 20th, 2012 (Monday)
- Labs
- LH 023
Thursday 1:00pm - 2:15pm (Andrew Hoffman)
Thursday 2:30pm - 3:45pm (Michael Keel and Paul Jenkins)
Thursday 4:00pm - 5:15pm (Paul Jenkins)
- Announcements
- You need to join the course mailing list to receive course
announcements. Visit
https://groups.google.com/d/forum/i211-2012f and join the
group. You need to set the email setting to
"Emails" (not abridged email or digest email) to receive
announcements immediately. All announcements will be made through
this mailing list. If you have a general question or would like to
share something, please send an email to i211-2012f@googlegroups.com.
- Instructors
- You may send any question to the instructor mailing list: i211-2012f-instructors@googlegroups.com
Yong-Yeol (YY) Ahn (yyahn@indiana.edu)
Office: Informatics East Room 316
Phone: (812) 856 2920
Office hours: Monday 11am - noon
(you can also schedule a meeting if you cannot make this time slot)
Chao Ji (jic@indiana.edu)
Office hours: Tuesday 10am - 12pm at LH
325
Andrew Hoffman (hoffmaae@indiana.edu)
Office hours: Tuesday 1pm - 2pm at Info
West 001
Michael Keel (mjkeel@indiana.edu)
Office hours: Tuesday 2pm - 3pm at Info West
001
Paul Jenkins (pauljenk@indiana.edu)
Office hours: Tuesday 3pm - 5pm at Info West
001
- Textbook
- No textbook is required. See Resources.
- Prerequisites
- It is assumed that you have taken I210, C211, or an
equivalent course.
Week | Dates | Topics | Course material and useful resources | Lab
1 | 8/20, 22 | Introduction, administrivia, and programming environments | Python; Should I learn programming?; Command line interface; Internet; Editors (on servers); Friendly editors for local machines; Python shell on the web | No labs
2 | 8/27, 29 | Basic data types | Website example; Numbers, strings, lists, dictionaries, and variables; Further readings on dictionaries; Internet and the web | Assignment 1 (due: 9/4 11pm)
9/3: No class (Labor Day)
3 | 9/5 | Strings, files, and control statements | Mutable vs. immutable; String formatting and manipulations; Files | Assignment 2 (due: 9/11 11pm)
4 | 9/10, 12 | Functions and modules | Built-in functions; Functions; More on functions and functional programming; Modules; Namespace; __name__; List comprehension | Assignment 3 (due: 9/18 11pm)
5 | 9/17, 19 | Classes, OOP, error handling | OOP; 'self'; Bugs; More on debugging and testing | Assignment 4 (due: 9/25 11pm)
6 | 9/24, 26 | OOP, Internet, and Web | Programming languages. How to learn them; Python modules; Scope; Errors and Exceptions; Internet | Assignment 4 continued; Midterm preparation
7 | 10/1 | Web and HTML; Review | Hypertext; Markup language; Python httplib | Assignment 5 (due: 10/9 11pm)
10/3: Mid-term exam
8 | 10/8, 10 | HTTP, Web crawler, and API | HTTP; Web crawling; URLlib; XML; JSON; API | Assignment 6 (due: 10/16 11pm)
9 | 10/15, 17 | Databases and SQL | Databases; API; ETC. | Assignment 7 (due: 10/23 11pm)
10 | 10/22, 24 | Database security, Encoding, MapReduce, and CGI | Encoding; SQL injection; Map and Reduce; CGI | Assignment 8 (due: 10/30 11pm)
11 | 10/29, 31 | CGI, Cron, NoSQL (MongoDB), and Regular expression | SQL injection; CGI; Cron; MongoDB; Regular Expression | Assignment 9 (due: 11/6 11pm)
12 | 11/5, 7 | Visualization, MVC framework | Visualization; MVC framework; Encoding detection; Final project ideas | Assignment 10 (due: 11/13 11pm)
13 | 11/12, 14 | Web frameworks, Cloud computing | Web frameworks; Clouds; Linux permission, path, and executable scripts | Final project
11/18-25: Thanksgiving
15 | 11/26, 28 | Web frameworks, cloud computing | UML; AWS; Crowdsourcing; WSGI | Final project
16 | 12/3, 5 | Review | | Final project
12/12: All late assignments due
12/14: Final project due
Class policies
- All announcements will be sent via the course mailing list.
Students are responsible for reading each email announcement in
detail.
- Please contact the instructor if you have a disability that requires
accommodation, or any other constraints that affect you, so
that appropriate arrangements can be made.
- The initial submission of each assignment should be made through
Oncourse. Check that your assignment was properly submitted,
because Oncourse sometimes fails to receive submissions. You are
responsible for making sure the submission went through, except in
rare circumstances of system failure.
- Late assignments: start early! All late
assignments should be sent to the AIs by email. During the first
week after the deadline, the score cap will be 80% (a perfect solution
can get 80% of the points). The cap becomes 60% from the second week.
- You are responsible for backing up all your data and code. Today
is International
Backup Awareness Day! Use a backup drive, Dropbox, or whatever service you
find appropriate. For students who are comfortable with
the command line, I highly recommend using a version
control system, especially with a hosting service such as GitHub (IU provides a firewalled GitHub) or Bitbucket.
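The late-assignment cap described above can be sketched as a small Python function. This is only an illustration under stated assumptions, not an official grading script: the function names are hypothetical, "first week" is read as 1 to 7 days after the deadline, and the cap is read as a ceiling on the score, not as a multiplier.

```python
def late_cap_percent(days_late):
    """Return the highest percentage of points a submission can earn."""
    if days_late <= 0:
        return 100  # on time: no cap
    if days_late <= 7:
        return 80   # first week after the deadline (assumed to mean days 1-7)
    return 60       # second week onward


def capped_score(earned, possible, days_late):
    """Limit the earned points to the cap that applies to this submission."""
    ceiling = possible * late_cap_percent(days_late) / 100
    return min(earned, ceiling)


print(capped_score(100, 100, 3))   # perfect solution, three days late
print(capped_score(70, 100, 3))    # below the cap, so unchanged
print(capped_score(100, 100, 10))  # perfect solution, second week
```

Under this reading, a solution that would have earned 70 out of 100 still earns 70 when submitted three days late, while a perfect solution earns 80.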
The aim of the final project is to synthesize the various aspects of
information infrastructure that you have been studying throughout the
course by building an actual system that
gathers, stores, processes, and
serves information and data. You can do either a standard project
or a challenge project.
- Standard project
- You will build a Twitter
sentiment analyzer.
- Challenge project
- You will design your own system. The only requirement is having
the four elements of information infrastructure. You should contact
the instructor with your ideas and designs.
Extra credit will be given for incorporating extra elements in any
aspect of the system. Some examples: you can bring in another data source
and perform interesting analyses by combining it with the Twitter data.
You can add a Google map and combine it with Twitter location data. You
can do more textual analyses.
Academic integrity
The principles of academic honesty and ethics will be enforced. Any
cases of academic misconduct (cheating, fabrication, plagiarism, etc.)
will be thoroughly investigated and immediately reported to the School
and the Dean of Students.
You are encouraged to discuss the work with others, but you must write your own
code and reports. Credit all your sources (discussions with other students,
software you used, etc.).
Grading policy
- Class participation: 10%
- Assignments: 40%
- Mid-term exam: 20%
- Final project: 30%
Class participation mainly concerns attendance at class and labs,
and taking part in in-class activities.
The mid-term exam will consist of multiple-choice and short-answer
questions about the course material we have covered and the lab
assignments.
You will get an F grade if you fail to submit the final project.
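As a worked example of how the weights above combine, here is a minimal Python sketch. The function names and the sample scores are hypothetical, and the syllabus does not give letter-grade cutoffs, so the sketch stops at the weighted percentage and the stated rule about the final project.

```python
# Component weights from the grading policy, in percent.
WEIGHTS = {
    "participation": 10,
    "assignments": 40,
    "midterm": 20,
    "final_project": 30,
}


def weighted_total(scores):
    """Combine component scores (each 0-100) into one course percentage."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


def final_result(scores, submitted_final_project):
    """Return "F" if the final project is missing, else the weighted total."""
    if not submitted_final_project:
        return "F"
    return weighted_total(scores)


# Hypothetical student scores, for illustration only.
example = {"participation": 90, "assignments": 80, "midterm": 70, "final_project": 100}
print(final_result(example, submitted_final_project=True))
```

For these sample scores the weighted total is (10 x 90 + 40 x 80 + 20 x 70 + 30 x 100) / 100 = 85.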
Books
Links
Past I211 courses
Relevant courses in other places