initial import

Alper Kanat 2008-10-21 20:22:55 +00:00
commit 998d78bc1d
111 changed files with 34617 additions and 0 deletions

AUTHORS Normal file

@@ -0,0 +1,2 @@
Scott James Remnant <scott@netsplit.com>
Jeff Waugh <jdub@perkypants.org>

INSTALL Normal file

@@ -0,0 +1,151 @@
Installing Planet
-----------------
You'll need at least Python 2.1 installed on your system; we recommend
Python 2.3, though, as there may be bugs in the earlier libraries.
Everything Pythonesque Planet needs should be included in the
distribution.
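
You can check which version of Python you have with:

    python -V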
i.   First you'll need to extract the files into a folder somewhere.
     I expect you've already done this; after all, you're reading this
     file. You can place this wherever you like; ~/planet is a good
     choice, but so is anywhere else you prefer.
ii.  Make a copy of the files in the 'examples' subdirectory and in
     either the 'basic' or 'fancy' subdirectory of it, and put them
     wherever you like; I like to use the Planet's name (so
     ~/planet/debian), but it's really up to you.

     The 'basic' index.html and associated config.ini are pretty plain
     and boring; if you're after less documentation and more instant
     gratification, you may wish to use the 'fancy' ones instead. You'll
     want the stylesheet and images from the 'output' directory if you
     use it.
iii. Edit the config.ini file in this directory to taste; it's pretty
     well documented, so you shouldn't have any problems here. Pay
     particular attention to the 'output_dir' option, which should be
     readable by your web server, and especially to the 'template_files'
     option, where you'll want to change "examples" to wherever you just
     placed your copies.
iv.  Edit the various template (*.tmpl) files to taste; a complete list
     of the available variables is at the bottom of this file.
v.   Run it: planet.py pathto/config.ini

     You'll want to add this to cron; make sure you run it from the
     right directory (see the example crontab entry below).
vi.  Tell us about it! We'd love to link to you on planetplanet.org :-)
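
For example, a crontab entry along these lines would rebuild the Planet
every ten minutes (the paths here are illustrative, assuming Planet was
extracted to ~/planet with a config in ~/planet/debian):

    */10 * * * *    cd ~/planet && ./planet.py debian/config.ini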
Template files
--------------
The template files used are given as a space-separated list in the
'template_files' option in config.ini. They are named ending in '.tmpl',
which is removed to form the name of the file placed in the output
directory.
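
For instance, with the line below (taken from the gezegen configuration
in this tree), 'index.html.tmpl' is processed into 'index.html' inside
'output_dir', and likewise for the other templates:

    template_files = gezegen/index.html.tmpl gezegen/rss20.xml.tmpl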
Reading through the example templates is recommended; they're designed to
pretty much drop straight into your site with little modification
anyway.
Inside these template files, <TMPL_VAR xxx> is replaced with the content
of the 'xxx' variable. The variables available are:
    name ........ } the value of the equivalent options
    link ........ } from the [Planet] section of your
    owner_name .. } Planet's config.ini file
    owner_email . }

    url ......... link with the output filename appended
    generator ... version of planet being used

    date ........ current date and time in your date format
    date_iso .... current date and time in ISO date format
    date_822 .... current date and time in RFC822 date format
There are also two loops, 'Items' and 'Channels'. All of the lines of
the template and variable substitutions are available for each item or
channel. Loops are created using <TMPL_LOOP LoopName>...</TMPL_LOOP>
and may be used as many times as you wish.
The 'Channels' loop iterates all of the channels (feeds) defined in the
configuration file; within it, the following variables are available:
    name ........ value of the 'name' option in config.ini, or title
    title ....... title retrieved from the channel's feed
    tagline ..... description retrieved from the channel's feed
    link ........ link for the human-readable content (from the feed)
    url ......... url of the channel's feed itself
Additionally the value of any other option specified in config.ini
for the feed, or in the [DEFAULT] section, is available as a
variable of the same name.
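
For example, a simple subscription list (a minimal sketch using only the
variables documented above):

    <ul>
    <TMPL_LOOP Channels>
      <li><a href="<TMPL_VAR link ESCAPE="HTML">"><TMPL_VAR name></a></li>
    </TMPL_LOOP>
    </ul>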
Depending on the feed, a huge variety of other variables may be
available; the best way to find out what you have is to use the
'planet-cache' tool to examine your cache files.
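
For example, to list every key that appears in a channel's items (the
cache file name here is made up; Planet derives the real name from the
feed URL, so check your cache directory for the actual file):

    ./planet-cache.py --keys cache/example.com,feed.xml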
The 'Items' loop iterates all of the blog entries from all of the
channels; you do not place it inside a 'Channels' loop. Within it, the
following variables are available:
    id .......... unique id for this entry (sometimes just the link)
    link ........ link to a human-readable version at the origin site
    title ....... title of the entry
    summary ..... a short "first page" summary
    content ..... the full content of the entry

    date ........ date and time of the entry in your date format
    date_iso .... date and time of the entry in ISO date format
    date_822 .... date and time of the entry in RFC822 date format
If an entry is the first to take place on a given date, the 'new_date'
variable is set to that date. This allows you to break up the page by
day.
If the entry is from a different channel to the previous entry, or is
the first entry from this channel on this day, the 'new_channel'
variable is set to the same value as the 'channel_url' variable. This
allows you to collate multiple entries from the same person under the
same banner.
Additionally the value of any variable that would be defined
for the channel is available, with 'channel_' prepended to the
name (e.g. 'channel_name' and 'channel_link').
Depending on the feed, a huge variety of other variables may be
available; the best way to find out what you have is to use the
'planet-cache' tool to examine your cache files.
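
Putting these together, a day-and-author grouped layout might look
something like this (a minimal sketch, using the <TMPL_IF xxx>
conditional described below; any surrounding markup is up to you):

    <TMPL_LOOP Items>
    <TMPL_IF new_date>
      <h2><TMPL_VAR new_date></h2>
    </TMPL_IF>
    <TMPL_IF new_channel>
      <h3><a href="<TMPL_VAR channel_link ESCAPE="HTML">"><TMPL_VAR channel_name></a></h3>
    </TMPL_IF>
    <TMPL_IF title>
      <h4><a href="<TMPL_VAR link ESCAPE="HTML">"><TMPL_VAR title></a></h4>
    </TMPL_IF>
    <TMPL_VAR content>
    </TMPL_LOOP>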
There are also a couple of other special things you can do in a template.
- If you want HTML escaping applied to the value of a variable, use the
<TMPL_VAR xxx ESCAPE="HTML"> form.
- If you want URI escaping applied to the value of a variable, use the
<TMPL_VAR xxx ESCAPE="URI"> form.
- To only include a section of the template if the variable has a
non-empty value, you can use <TMPL_IF xxx>....</TMPL_IF>. e.g.
    <TMPL_IF new_date>
      <h1><TMPL_VAR new_date></h1>
    </TMPL_IF>
You may place a <TMPL_ELSE> within this block to specify an
alternative, or may use <TMPL_UNLESS xxx>...</TMPL_UNLESS> to
perform the opposite.
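
For example, to fall back when an entry has no title:

    <TMPL_IF title>
      <h4><TMPL_VAR title></h4>
    <TMPL_ELSE>
      <h4>(untitled)</h4>
    </TMPL_IF>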

LICENCE Normal file

@@ -0,0 +1,84 @@
Planet is released under the same licence as Python; here it is:
A. HISTORY OF THE SOFTWARE
==========================
Python was created in the early 1990s by Guido van Rossum at Stichting Mathematisch Centrum (CWI) in the Netherlands as a successor of a language called ABC. Guido is Python's principal author, although it includes many contributions from others. The last version released from CWI was Python 1.2. In 1995, Guido continued his work on Python at the Corporation for National Research Initiatives (CNRI) in Reston, Virginia where he released several versions of the software. Python 1.6 was the last of the versions released by CNRI. In 2000, Guido and the Python core development team moved to BeOpen.com to form the BeOpen PythonLabs team. Python 2.0 was the first and only release from BeOpen.com.
Following the release of Python 1.6, and after Guido van Rossum left CNRI to work with commercial software developers, it became clear that the ability to use Python with software available under the GNU Public License (GPL) was very desirable. CNRI and the Free Software Foundation (FSF) interacted to develop enabling wording changes to the Python license. Python 1.6.1 is essentially the same as Python 1.6, with a few minor bug fixes, and with a different license that enables later versions to be GPL-compatible. Python 2.1 is a derivative work of Python 1.6.1, as well as of Python 2.0.
After Python 2.0 was released by BeOpen.com, Guido van Rossum and the other PythonLabs developers joined Digital Creations. All intellectual property added from this point on, starting with Python 2.1 and its alpha and beta releases, is owned by the Python Software Foundation (PSF), a non-profit modeled after the Apache Software Foundation. See http://www.python.org/psf/ for more information about the PSF.
Thanks to the many outside volunteers who have worked under Guido's direction to make these releases possible.
B. TERMS AND CONDITIONS FOR ACCESSING OR OTHERWISE USING PYTHON
===============================================================
PSF LICENSE AGREEMENT
---------------------
1. This LICENSE AGREEMENT is between the Python Software Foundation ("PSF"), and the Individual or Organization ("Licensee") accessing and otherwise using Python 2.1.1 software in source or binary form and its associated documentation.
2. Subject to the terms and conditions of this License Agreement, PSF hereby grants Licensee a nonexclusive, royalty-free, world-wide license to reproduce, analyze, test, perform and/or display publicly, prepare derivative works, distribute, and otherwise use Python 2.1.1 alone or in any derivative version, provided, however, that PSF's License Agreement and PSF's notice of copyright, i.e., "Copyright (c) 2001 Python Software Foundation; All Rights Reserved" are retained in Python 2.1.1 alone or in any derivative version prepared by Licensee.
3. In the event Licensee prepares a derivative work that is based on or incorporates Python 2.1.1 or any part thereof, and wants to make the derivative work available to others as provided herein, then Licensee hereby agrees to include in any such work a brief summary of the changes made to Python 2.1.1.
4. PSF is making Python 2.1.1 available to Licensee on an "AS IS" basis. PSF MAKES NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR IMPLIED. BY WAY OF EXAMPLE, BUT NOT LIMITATION, PSF MAKES NO AND DISCLAIMS ANY REPRESENTATION OR WARRANTY OF MERCHANTABILITY OR FITNESS FOR ANY PARTICULAR PURPOSE OR THAT THE USE OF PYTHON 2.1.1 WILL NOT INFRINGE ANY THIRD PARTY RIGHTS.
5. PSF SHALL NOT BE LIABLE TO LICENSEE OR ANY OTHER USERS OF PYTHON 2.1.1 FOR ANY INCIDENTAL, SPECIAL, OR CONSEQUENTIAL DAMAGES OR LOSS AS A RESULT OF MODIFYING, DISTRIBUTING, OR OTHERWISE USING PYTHON 2.1.1, OR ANY DERIVATIVE THEREOF, EVEN IF ADVISED OF THE POSSIBILITY THEREOF.
6. This License Agreement will automatically terminate upon a material breach of its terms and conditions.
7. Nothing in this License Agreement shall be deemed to create any relationship of agency, partnership, or joint venture between PSF and Licensee. This License Agreement does not grant permission to use PSF trademarks or trade name in a trademark sense to endorse or promote products or services of Licensee, or any third party.
8. By copying, installing or otherwise using Python 2.1.1, Licensee agrees to be bound by the terms and conditions of this License Agreement.
BEOPEN.COM TERMS AND CONDITIONS FOR PYTHON 2.0
----------------------------------------------
BEOPEN PYTHON OPEN SOURCE LICENSE AGREEMENT VERSION 1
1. This LICENSE AGREEMENT is between BeOpen.com ("BeOpen"), having an office at 160 Saratoga Avenue, Santa Clara, CA 95051, and the Individual or Organization ("Licensee") accessing and otherwise using this software in source or binary form and its associated documentation ("the Software").
2. Subject to the terms and conditions of this BeOpen Python License Agreement, BeOpen hereby grants Licensee a non-exclusive, royalty-free, world-wide license to reproduce, analyze, test, perform and/or display publicly, prepare derivative works, distribute, and otherwise use the Software alone or in any derivative version, provided, however, that the BeOpen Python License is retained in the Software, alone or in any derivative version prepared by Licensee.
3. BeOpen is making the Software available to Licensee on an "AS IS" basis. BEOPEN MAKES NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR IMPLIED. BY WAY OF EXAMPLE, BUT NOT LIMITATION, BEOPEN MAKES NO AND DISCLAIMS ANY REPRESENTATION OR WARRANTY OF MERCHANTABILITY OR FITNESS FOR ANY PARTICULAR PURPOSE OR THAT THE USE OF THE SOFTWARE WILL NOT INFRINGE ANY THIRD PARTY RIGHTS.
4. BEOPEN SHALL NOT BE LIABLE TO LICENSEE OR ANY OTHER USERS OF THE SOFTWARE FOR ANY INCIDENTAL, SPECIAL, OR CONSEQUENTIAL DAMAGES OR LOSS AS A RESULT OF USING, MODIFYING OR DISTRIBUTING THE SOFTWARE, OR ANY DERIVATIVE THEREOF, EVEN IF ADVISED OF THE POSSIBILITY THEREOF.
5. This License Agreement will automatically terminate upon a material breach of its terms and conditions.
6. This License Agreement shall be governed by and interpreted in all respects by the law of the State of California, excluding conflict of law provisions. Nothing in this License Agreement shall be deemed to create any relationship of agency, partnership, or joint venture between BeOpen and Licensee. This License Agreement does not grant permission to use BeOpen trademarks or trade names in a trademark sense to endorse or promote products or services of Licensee, or any third party. As an exception, the "BeOpen Python" logos available at http://www.pythonlabs.com/logos.html may be used according to the permissions granted on that web page.
7. By copying, installing or otherwise using the software, Licensee agrees to be bound by the terms and conditions of this License Agreement.
CNRI OPEN SOURCE GPL-COMPATIBLE LICENSE AGREEMENT
-------------------------------------------------
1. This LICENSE AGREEMENT is between the Corporation for National Research Initiatives, having an office at 1895 Preston White Drive, Reston, VA 20191 ("CNRI"), and the Individual or Organization ("Licensee") accessing and otherwise using Python 1.6.1 software in source or binary form and its associated documentation.
2. Subject to the terms and conditions of this License Agreement, CNRI hereby grants Licensee a nonexclusive, royalty-free, world-wide license to reproduce, analyze, test, perform and/or display publicly, prepare derivative works, distribute, and otherwise use Python 1.6.1 alone or in any derivative version, provided, however, that CNRI's License Agreement and CNRI's notice of copyright, i.e., "Copyright (c) 1995-2001 Corporation for National Research Initiatives; All Rights Reserved" are retained in Python 1.6.1 alone or in any derivative version prepared by Licensee. Alternately, in lieu of CNRI's License Agreement, Licensee may substitute the following text (omitting the quotes): "Python 1.6.1 is made available subject to the terms and conditions in CNRI's License Agreement. This Agreement together with Python 1.6.1 may be located on the Internet using the following unique, persistent identifier (known as a handle): 1895.22/1013. This Agreement may also be obtained from a proxy server on the Internet using the following URL: http://hdl.handle.net/1895.22/1013".
3. In the event Licensee prepares a derivative work that is based on or incorporates Python 1.6.1 or any part thereof, and wants to make the derivative work available to others as provided herein, then Licensee hereby agrees to include in any such work a brief summary of the changes made to Python 1.6.1.
4. CNRI is making Python 1.6.1 available to Licensee on an "AS IS" basis. CNRI MAKES NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR IMPLIED. BY WAY OF EXAMPLE, BUT NOT LIMITATION, CNRI MAKES NO AND DISCLAIMS ANY REPRESENTATION OR WARRANTY OF MERCHANTABILITY OR FITNESS FOR ANY PARTICULAR PURPOSE OR THAT THE USE OF PYTHON 1.6.1 WILL NOT INFRINGE ANY THIRD PARTY RIGHTS.
5. CNRI SHALL NOT BE LIABLE TO LICENSEE OR ANY OTHER USERS OF PYTHON 1.6.1 FOR ANY INCIDENTAL, SPECIAL, OR CONSEQUENTIAL DAMAGES OR LOSS AS A RESULT OF MODIFYING, DISTRIBUTING, OR OTHERWISE USING PYTHON 1.6.1, OR ANY DERIVATIVE THEREOF, EVEN IF ADVISED OF THE POSSIBILITY THEREOF.
6. This License Agreement will automatically terminate upon a material breach of its terms and conditions.
7. This License Agreement shall be governed by the federal intellectual property law of the United States, including without limitation the federal copyright law, and, to the extent such U.S. federal law does not apply, by the law of the Commonwealth of Virginia, excluding Virginia's conflict of law provisions. Notwithstanding the foregoing, with regard to derivative works based on Python 1.6.1 that incorporate non-separable material that was previously distributed under the GNU General Public License (GPL), the law of the Commonwealth of Virginia shall govern this License Agreement only as to issues arising under or with respect to Paragraphs 4, 5, and 7 of this License Agreement. Nothing in this License Agreement shall be deemed to create any relationship of agency, partnership, or joint venture between CNRI and Licensee. This License Agreement does not grant permission to use CNRI trademarks or trade name in a trademark sense to endorse or promote products or services of Licensee, or any third party.
8. By clicking on the "ACCEPT" button where indicated, or by copying, installing or otherwise using Python 1.6.1, Licensee agrees to be bound by the terms and conditions of this License Agreement.
ACCEPT
CWI PERMISSIONS STATEMENT AND DISCLAIMER
----------------------------------------
Copyright (c) 1991 - 1995, Stichting Mathematisch Centrum Amsterdam, The Netherlands. All rights reserved.
Permission to use, copy, modify, and distribute this software and its documentation for any purpose and without fee is hereby granted, provided that the above copyright notice appear in all copies and that both that copyright notice and this permission notice appear in supporting documentation, and that the name of Stichting Mathematisch Centrum or CWI not be used in advertising or publicity pertaining to distribution of the software without specific, written prior permission.
STICHTING MATHEMATISCH CENTRUM DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS, IN NO EVENT SHALL STICHTING MATHEMATISCH CENTRUM BE LIABLE FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.

NEWS Normal file

@@ -0,0 +1,4 @@
Planet 1.0
----------
* First release!

PKG-INFO Normal file

@@ -0,0 +1,10 @@
Metadata-Version: 1.0
Name: planet
Version: nightly
Summary: The Planet Feed Aggregator
Home-page: http://www.planetplanet.org/
Author: Planet Developers
Author-email: devel@lists.planetplanet.org
License: Python
Description: UNKNOWN
Platform: UNKNOWN

README Normal file

@@ -0,0 +1,12 @@
Planet
------
Planet is a flexible feed aggregator. It downloads news feeds published by
web sites and aggregates their content together into a single combined feed,
latest news first.
It uses Mark Pilgrim's Universal Feed Parser to read from RDF, RSS and Atom
feeds; and Tomas Styblo's templating engine to output static files in any
format you can dream up.
Keywords: feed, blog, aggregator, RSS, RDF, Atom, OPML, Python

THANKS Normal file

@@ -0,0 +1,18 @@
Patches and Bug Fixes
---------------------
Chris Dolan - fixes, exclude filtering, duplicate culling
David Edmondson - filtering
Lucas Nussbaum - locale configuration
David Pashley - cache code profiling and recursion fixing
Gediminas Paulauskas - days per page
Spycyroll Maintainers
---------------------
Vattekkat Satheesh Babu
Richard Jones
Garth Kidd
Eliot Landrum
Bryan Richard

TODO Normal file

@@ -0,0 +1,22 @@
TODO
====
* Expire feed history
The feed cache doesn't currently expire old entries, so it could get
large quite rapidly. We should probably have a config setting for
the cache expiry; the trouble is some channels might need a longer
or shorter one than others.
* Allow display normalisation to specified timezone
Some Planet admins would like their feed to be displayed in the local
timezone, instead of UTC.
* Support OPML and FOAF subscriptions
This might be a bit invasive, but I want to be able to subscribe to OPML
and FOAF files, and see each feed as if it were subscribed individually.
Perhaps we can do this with a two-pass configuration scheme, first to pull
the static configs, second to go fetch and generate the dynamic configs.
The more I think about it, the less invasive it sounds. Hmm.

gezegen/config.ini Normal file

@@ -0,0 +1,491 @@
[Planet]
name = Linux Gezegeni
link = http://gezegen.linux.org.tr
owner_name = Gezegen Ekibi
owner_email = gezegen@linux.org.tr
cache_directory = cache
new_feed_items = 1
log_level = DEBUG
template_files = gezegen/index.html.tmpl gezegen/rss20.xml.tmpl gezegen/rss10.xml.tmpl gezegen/opml.xml.tmpl gezegen/foafroll.xml.tmpl
output_dir = www/
# items_per_page = 15
items_per_page = 25
days_per_page = 0
feed_timeout = 15
encoding = utf-8
locale = tr_TR.UTF-8
date_format = %d %b %Y @ %I:%M %p
#date_format = %B %d, %Y %I:%M %p
new_date_format = %d %B %Y
[DEFAULT]
facewidth = 64
faceheight = 64
#[http://feeds.feedburner.com/gl]
#name = Gezegen Ekibi
#face = gezegenekibi.png
[http://ahmet.pardusman.org/blog/feed/?cat=2]
name = Ahmet Aygün
face = ahmetaygun.png
#email = ahmet@pardusman.org
#jabber = aa@jabber.uludag.org.tr
#[http://arda.pardusman.org/blog/tag/gezegen/feed/]
#name = Arda Çetin
#face = ardacetin.png
#email = arda@pardusman.org
#jabber = arda@jabber.uludag.org.tr
#RSS address changed on 12 April 2007. DG.
#Old address: http://cekirdek.pardus.org.tr/~meren/blog/rss.cgi
[http://cekirdek.pardus.org.tr/~meren/blog/feed/rss/]
name = A. Murat Eren
face = meren.png
#email = meren@pardus.org.tr
#jabber = meren@jabber.uludag.org.tr
[http://www.ademalpyildiz.com.tr/feed/]
name = Adem Alp Yıldız
#email = ademalp@linux-sevenler.org / ben@ademalpyildiz.com.tr
#jabber = ademalp@linux-sevenler.org
[http://www.erdinc.info/?cat=6&feed=rss2]
name = Ali Erdinç Köroğlu
face = alierdinckoroglu.png
#email = erdinc@pardus.org.tr
#jabber = erdinc@jabber.uludag.org.tr
# Removed because of a post we saw on the Planet. DG, 12 April 2007
# http://burkinafasafiso.com/2007/04/12/gezegene-elveda/
#[http://www.burkinafasafiso.com/category/acik-kaynak/feed/]
#name = Ali Işıngör
#email = isingor@gmail.com
#jabber = aisingor@jabber.uludag.org.tr
[http://feeds.feedburner.com/raptiye/acikkaynak/]
name = Alper Kanat
face = alperkanat.png
#email = tunix@raptiye.org
#jabber = tunix@jabber.org
[http://www.murekkep.org/wp-feed.php?category_name=bilisim&author_name=admin&feed=rss2]
name = Alper Orus
#email = alperor@linux-sevenler.org
#jabber =
[http://armish.linux-sevenler.org/blog/category/gezegen/feed]
name = Arman Aksoy
face = armanaksoy.png
#email = armish@linux-sevenler.org
#jabber =
[http://www.metin.org/gunluk/feed/rss/]
name = Barış Metin
face = barismetin.png
#email = baris@pardus.org.tr
#jabber = baris@jabber.uludag.org.tr
[http://www.tuxworkshop.com/blog/?feed=rss2]
name = Barış Özyurt
face = barisozyurt.png
#email = baris@tuxworkshop.com
#jabber = barisozyurt@jabber.uludag.org.tr
[http://feeds.feedburner.com/canburak-gezegen-linux]
name = Can Burak Çilingir
#email = can@canb.net
#jabber = can@canb.net
[http://cankavaklioglu.name.tr/guncelgunce/archives/linux/index-rss.xml]
name = Can Kavaklıoğlu
#email = eposta@cankavaklioglu.name.tr
#jabber =
[http://blog.gunduz.org/index.php?/feeds/categories/7-OEzguer-Yazlm.rss]
name = Devrim Gündüz
face = devrimgunduz.png
#email = devrim@gunduz.org
#jabber =
[http://zzz.fisek.com.tr/seyir-defteri/wp-rss2.php?cat=3]
name = Doruk Fişek
face = dorukfisek.png
#email = dfisek@fisek.com.tr
#jabber = dfisek@jabber.fisek.com.tr
[http://ekin.fisek.com.tr/blog/wp-rss2.php?cat=5]
name = Ekin Meroğlu
face = ekinmeroglu.png
#email = ekin@fisek.com.tr
#jabber = ekin@jabber.fisek.com.tr
#[http://aylinux.blogspot.com/atom.xml]
#name = Emre Karaoğlu
#email = emre@uzem.itu.edu.tr
#jabber =
[http://feeds.feedburner.com/TheUselessJournalV4]
name = Erçin Eker
face = ercineker.png
#email = erc.caldera@gmx.net
#jabber = ercineker@jabber.org
[http://enveraltin.com/blog?flav=rss]
name = Enver Altın
#email = ealtin@parkyeri.com / skyblue@skyblue.gen.tr
#jabber = ea@enveraltin.com
[http://www.erhanekici.com/blog/category/linux/rss]
name = Erhan Ekici
#email = erhan@uzem.itu.edu.tr / erhan.ekici@gmail.com
#jabber =
#Removed at their own request, 18/07/07
#[http://cekirdek.pardus.org.tr/~tekman/zangetsu/blog/feed/rss/Linux]
#name = Erkan Tekman
#face = erkantekman.png
#email = tekman@pardus.org.tr
#jabber = tekman@jabber.uludag.org.tr
#[http://ileriseviye.org/blog/?feed=rss2]
#name = Emre Sevinç
#email = emres@bilgi.edu.tr
#jabber =
[http://www.faikuygur.com/blog/feed/?cat=-4]
name = Faik Uygur
face = faikuygur.png
#email = faik@pardus.org.tr
#jabber = faik@jabber.uludag.org.tr
[http://blog.arsln.org/category/gezegen/feed]
name = Fatih Arslan
#email = fatih@arsln.org
#jabber =
[http://cekirdek.pardus.org.tr/~gokmen/zangetsu/blog/feed/rss/Gezegen/]
name = Gökmen Göksel
face = gokmengoksel.png
#email = gokmen@pardus.org.tr
#jabber = gokmen.goksel@jabber.uludag.org.tr
[http://6kere9.com/blag/feed/rss/Genel/]
name = Gürer Özen
face = gurerozen.png
#email = gurer@pardus.org.tr
#jabber = gurer@jabber.uludag.org.tr
[http://www.hakanuygun.com/blog/?feed=atom&cat=13]
name = Hakan Uygun
#email = hakan.uygun@linux.org.tr
#jabber =
[http://www.huseyinuslu.net/topics/linux/feed]
name = Hüseyin Uslu
face = huseyinuslu.png
#email = shalafiraistlin@gmail.com
#jabber =
#03/07/2007 Asked to be removed, via Devrim
#[http://cekirdek.pardus.org.tr/~ismail/blog/rss.cgi]
#name = İsmail Dönmez
#face = ismaildonmez.png
#email = ismail@pardus.org.tr
#jabber = ismail@jabber.uludag.org.tr
[http://www.koray.org/blog/wp-rss2.php?cat=7]
name = Koray Bostancı
#email = koray@kde.org.tr
#jabber = koraybostanci@jabber.org
#Asked to be removed on 09/08/2007.
#[http://cekirdek.pardus.org.tr/~loker/zangetsu/blog/feed/rss/Pardus/]
#name = Koray Löker
#face = korayloker.png
#email = loker@pardus.org.tr
#jabber = loker@jabber.uludag.org.tr
[http://marenostrum.blogsome.com/category/gezegen/feed/]
name = K. Deniz Öğüt
face = kdenizogut.png
#email: kdenizogut@gmail.com
#jabber: kdenizogut@jabber.org
[http://www.blockdiagram.net/blog/rss.xml]
name = Kerem Can Karakaş
#email:
#jabber: blokdiyagram@gmail.com
[http://www.kuzeykutbu.org/blog/?feed=rss2&cat=3]
name = Kaya Oğuz
face = kayaoguz.png
#email = kaya@kuzeykutbu.org
#jabber = kaya@jabber.fisek.com.tr
[http://leoman.gen.tr/blg/category/lkd-gezegen/feed]
name = Levent Yalçın
#email = leoman@leoman.gen.tr
#jabber =
[http://blog.corporem.org/?feed=rss2&cat=3]
name = M.Tuğrul Yılmazer
face = tugrulyilmazer.png
#email = corporem@corporem.org
#jabber = tugrul@jabber.org
[http://www.sonofnights.com/category/turkce/linux/feed]
name = Mehmet Büyüközer
#email = mbuyukozer@yahoo.com
#jabber =
[http://mhazer.blogspot.com/feeds/posts/default/-/gezegen]
name = Murat Hazer
#email = murathazer@gmail.com
#jabber =
#12/05/2008 RSS unreachable
#[http://mail.kivi.com.tr/blog/wp-rss2.php]
#name = Murat Koç
#email = muratkoc@kivi.com.tr
#jabber =
[http://web.inonu.edu.tr/~mkarakaplan/blog/wp-rss2.php]
name = Mustafa Karakaplan
#email = mkarakaplan@inonu.edu.tr
#jabber =
[http://panhaema.com/rss.php?mcat=linux]
name = Murat Sağlam
face = muratsaglam.png
#email = benimkaosum@hotmail.com
#jabber = darkhunter@jabber.org
[http://mmakbas.wordpress.com/tag/gezegen/feed/]
name = M.Murat Akbaş
#email = mma@aa.com.tr
#jabber =
#[http://demir.web.tr/blog/atom.php] Atom broke; let's try RSS
[http://feeds.feedburner.com/ndemirgezegen]
name = Necati Demir
face = necatidemir.png
#email = ndemir@demir.web.tr
#jabber =
[http://nyucel.blogspot.com/feeds/posts/default/-/gezegen]
name = Necdet Yücel
face = necdetyucel.png
#email = nyucel@comu.edu.tr
#jabber = nyucel@jabber.uludag.org.tr
[http://www.yalazi.org/wp/index.php/feed/]
name = Onur Yalazı
face = onuryalazi.png
#email = onur@yalazi.org
#jabber = yalazi@jabber.parkyeri.com
[http://feeds.feedburner.com/oguzy-gezegen]
name = Oğuz Yarımtepe
face = oguzyarimtepe.png
#email = oguzy@comu.edu.tr
#jabber = oguzy@jabber.uludag.org.tr
[http://bilisimlab.com/blog/rss.php]
name = Ömer Fadıl Usta
#email = usta@bilisimlab.com
#jabber =
[http://feeds.feedburner.com/pinguar]
name = Pınar Yanardağ
face = pinaryanardag.png
#email = pinar@comu.edu.tr
#jabber = pinguar@12jabber.com
[http://nightwalkers.blogspot.com/atom.xml]
name = Serbülent Ünsal
#email = serbulentu@gmail.com
#jabber =
[http://blogs.lkd.org.tr/seminercg/index.php?/feeds/categories/2-Seminer.rss]
name = LKD Seminer Duyuruları
face = seminercg.png
#email = seminer@linux.org.tr
[http://www.serveracim.net/serendipity/index.php?/feeds/index.rss2]
name = Server Acim
face = serveracim.png
#email = sacim@kde.org.tr
#jabber =
[http://www.ayder.org/gunluk/?feed=rss2]
name = Sinan Alyürük
#email = sinan.alyuruk@linux.org.tr
#jabber =
[http://talat.uyarer.com/?feed=rss2]
name = Talat Uyarer
#email = talat@uyarer.com
#jabber= uyarertalat@gmail.com
[http://tonguc.name/blog/?flav=rss]
name = Tonguç Yumruk
face = tongucyumruk.png
#email = tongucyumruk@fazlamesai.net
#jabber = tonguc@linux-sevenler.org
[http://sehitoglu.web.tr/gunluk/?feed=rss2&cat=12]
name = Onur Tolga Şehitoğlu
#email = onur@ceng.metu.edu.tr
#jabber = onursehitoglu@jabber.org
#12/05/2008 RSS unreachable
#[http://ergenoglu.org/blog/?feed=rss2]
#name = Üstün Ergenoğlu
#email = ustun.ergenoglu@gmail.com
#jabber = ustun.ergenoglu@gmail.com
[http://handlet.blogspot.com/atom.xml]
name = Ümran Kamar
face = umrankamar.png
#email = umrankamar@gmail.com
#jabber = umran@jabber.org
[http://00101010.info/konu/teknik.rss]
name = Recai Oktaş
#email = roktas@debian.org
#jabber =
#21/05/2007 Nobody at this address anymore...
#[http://geekshideout.blogspot.com/feeds/posts/default]
#name = Mehmet Erten
#face = mehmeterten.png
#email = mehmeterten@gawab.com
[http://www.bugunlinux.com/?feed=rss2]
name = Ahmet Yıldız
#face =
#email = ahmyildiz@gmail.com
#temporarily removed for swearing
#[http://ish.kodzilla.org/blog/?feed=rss2&cat=4]
#name = İşbaran Akçayır
#face =
#email = isbaran@gmail.com
[http://feeds.feedburner.com/SerkanLinuxGezegeni]
name = Serkan Altuntaş
#face =
#email = serkan@serkan.gen.tr
[http://www.furkancaliskan.com/blog/category/gezegen/feed]
name = Furkan Çalışkan
#face =
#email = caliskanfurkan@gmail.com
[http://eumur.wordpress.com/feed]
name = Umur Erdinç
#face =
#email = umurerdinc@gmail.com
[http://blogs.lkd.org.tr/penguencg/index.php?/feeds/index.rss2]
name = Penguen-CG
#face =
#email =
[http://serkank.wordpress.com/feed/]
name = Serkan Kaba
face = serkankaba.png
#email = serkan_kaba@yahoo.com
#[http://www.alpersomuncu.com/blog/category/linux/feed]
[http://www.alpersomuncu.com/blog/index.php?/feeds/categories/8-Linux.rss]
name = Alper Somuncu
face = alpersomuncu.png
#email = Alper Somuncu <alpersomuncu@gmail.com>
[http://blogs.lkd.org.tr/standcg/index.php?/feeds/index.rss2]
name = Stand
#face = alpersomuncu.png
#email = stand-cg@linux.org.tr
[http://feeds.feedburner.com/nesimia-gezegen?format=xml]
name = Nesimi Acarca
#face = alpersomuncu.png
#email = nesimiacarca@gmail.com
[http://www.soyoz.com/gunce/etiket/linux-gezegeni/rss]
name = Erol Soyöz
#face = alpersomuncu.png
#email = erol@soyoz.com
[http://gurcanozturk.com/feed/]
name = Gürcan Öztürk
#face = alpersomuncu.png
#email = gurcan@gurcanozturk.com
[http://www.python-tr.com/feed/atom/]
name = Python-TR
#face = alpersomuncu.png
#email = ugursamsa@ugurs.com
[http://www.ozgurlukicin.com/rss/haber]
name = Özgürlükiçin.com
#face = alpersomuncu.png
#email = ugursamsa@ugurs.com
[http://gunluk.lkd.org.tr/webcg/feed]
name = Web-CG
#face = alpersomuncu.png
#email = bahri@bahri.info
[http://www.bahri.info/category/linux/feed]
name = Bahri Meriç Canlı
#face = alpersomuncu.png
#email = bahri@bahri.info
[http://blogs.portakalteknoloji.com/bora/blog/feed/rss/]
name = Bora Güngören
#face = alpersomuncu.png
#email = bora@boragungoren.com
#01/06/08 temporarily suspended
#[http://www.ozgurkaratas.com/index.php/feed/]
#name = Özgür Karataş
#face = alpersomuncu.png
#email = okaratas@ogr.iu.edu.tr
[http://www.kirmizivesiyah.org/index.php/category/gezegen/feed/]
name = Kubilay Onur Güngör
#face = alpersomuncu.png
#email = ko.gungor@gmail.com
[http://gunluk.lkd.org.tr/yk/feed/?cat=5]
name = LKD YK
#face = alpersomuncu.png
#email = ko.gungor@gmail.com
[http://flyeater.wordpress.com/tag/lkd/feed]
name = Deniz Koçak
#face = alpersomuncu.png
#email = deniz.kocak@gmail.com
[http://serkan.feyvi.org/blog/category/debian/feed]
name = Serkan Kenar
#email = serkankenar@gmail.com
[http://armuting.blogspot.com/feeds/posts/default/-/%C3%B6i_gezegen]
name = Ali Erkan İMREK
#email = alierkanimrek@gmail.com
[http://www.lkd.org.tr/news/aggregator/RSS]
name = LKD.org.tr
#email = web@linux.org.tr
[http://gunluk.lkd.org.tr/ftp/feed/]
name = FTP ekibi
#email = ftp@linux.org.tr

gezegen/foafroll.xml.tmpl Normal file

@@ -0,0 +1,31 @@
<?xml version="1.0"?>
<rdf:RDF
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
xmlns:foaf="http://xmlns.com/foaf/0.1/"
xmlns:rss="http://purl.org/rss/1.0/"
xmlns:dc="http://purl.org/dc/elements/1.1/"
>
<foaf:Group>
<foaf:name><TMPL_VAR name></foaf:name>
<foaf:homepage><TMPL_VAR link ESCAPE="HTML"></foaf:homepage>
<rdfs:seeAlso rdf:resource="<TMPL_VAR uri ESCAPE="HTML">" />
<TMPL_LOOP Channels>
<foaf:member>
<foaf:Agent>
<foaf:name><TMPL_VAR name></foaf:name>
<foaf:weblog>
<foaf:Document rdf:about="<TMPL_VAR link ESCAPE="HTML">">
<dc:title><TMPL_VAR title></dc:title>
<rdfs:seeAlso>
<rss:channel rdf:about="<TMPL_VAR uri ESCAPE="HTML">" />
</rdfs:seeAlso>
</foaf:Document>
</foaf:weblog>
</foaf:Agent>
</foaf:member>
</TMPL_LOOP>
</foaf:Group>
</rdf:RDF>

gezegen/index.html.tmpl Normal file

@@ -0,0 +1,161 @@
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html>
<head>
<title><TMPL_VAR name></title>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<link rel="stylesheet" href="http://gezegen.linux.org.tr/generic.css" type="text/css" />
<link rel="stylesheet" href="http://gezegen.linux.org.tr/layout.css" type="text/css" />
<link rel="stylesheet" href="http://gezegen.linux.org.tr/planet.css" type="text/css" />
<link rel="stylesheet" href="http://gezegen.linux.org.tr/bloggers.css" type="text/css" />
<!--
FIXME: add favicon
<link rel="icon" type="image/png" href="images/logo.png" />
<link rel="shortcut icon" type="image/png" href="images/logo.png" />
-->
<link rel="alternate" type="application/rss+xml" title="<TMPL_VAR name>" href="http://gezegen.linux.org.tr/rss20.xml" />
</head>
<body>
<div id="hdr">
<div id="banner"><img src="http://gezegen.linux.org.tr/images/spacer.png" alt="spacer" /></div>
<div id="logo"><a href="http://gezegen.linux.org.tr/"><img src="http://gezegen.linux.org.tr/images/spacer.png" alt="Anasayfa" /></a></div>
<!--
<div id="hdrNav">
<a href="http://gezegenlinux.blogspot.com/">Linux Gezegeni Haberleri</a>
</div>-->
<!--
<div id="hdrNav">
<center><a href="http://www.cebitbilisim.com/tr" target="_blank"><img src="http://gezegen.linux.org.tr/images/banner2006-tr.gif" border="0"></a></center>
</div>
-->
</div>
<div id="body">
<TMPL_LOOP Items>
<TMPL_IF new_date>
<h2 class="date"><TMPL_VAR new_date></h2>
</TMPL_IF>
<div class="entry <TMPL_IF channel_nick><TMPL_VAR channel_nick></TMPL_IF>">
<div class="person-info">
<a href="<TMPL_VAR channel_link ESCAPE="HTML">" title="<TMPL_VAR channel_title ESCAPE="HTML">">
<TMPL_IF channel_face>
<img class="face" src="http://gezegen.linux.org.tr/images/heads/<TMPL_VAR channel_face ESCAPE="HTML">" title="<TMPL_VAR channel_name>" /><br />
<TMPL_ELSE>
<img class="face" src="http://gezegen.linux.org.tr/images/heads/nobody.png" title="<TMPL_VAR channel_name>" /><br />
</TMPL_IF>
<TMPL_VAR channel_name><TMPL_IF channel_nick><br />(<TMPL_VAR channel_nick>)</TMPL_IF>
</a>
</div>
<div class="post">
<div class="post2">
<div class="post-header">
<TMPL_IF title>
<h4 class="post-title"><a href="<TMPL_VAR link ESCAPE="HTML">"><TMPL_VAR title></a></h4>
<TMPL_ELSE>
<div class="post-title"><span>&nbsp;</span></div>
</TMPL_IF>
</div>
<br />
<div class="post-contents">
<TMPL_VAR content>
<br />
<br />
<div id="post-links" style="text-align: center;">
<TMPL_IF comments><a href="<TMPL_VAR comments ESCAPE="HTML">"><img src="images/yorum.png" border="0" title="Yorumlar" /></a></TMPL_IF>
<a href="http://del.icio.us/post?url=<TMPL_VAR link ESCAPE="HTML">&title=<TMPL_VAR title ESCAPE="HTML">" target="_blank"><img src="images/delicious.png" border="0" title="del.icio.us'a gönder" /></a>
<a href="http://technorati.com/search/<TMPL_VAR link ESCAPE="HTML">" target="_blank"><img src="images/technorati.png" border="0" title="technorati'de ara" /></a>
</div>
</div>
<div class="post-footer">
<p><a href="<TMPL_VAR link ESCAPE="HTML">"><TMPL_VAR date></a></p>
</div>
</div>
</div>
</div>
</TMPL_LOOP>
</div>
<div id="sidebar">
<div class="section">
<h3>Gezegen Hakkında</h3>
<p>Linux Gezegeni, Türkiye'de Linux ve Özgür Yazılım konusunda çalışmalar yapan arkadaşlarımızın web üzerindeki günlüklerini bir tek sayfadan okumamızı ve kendi dünyalarına ulaşmamızı sağlayan basit bir web sitesidir.</p>
<p>Gezegeni <a href="http://www.planetplanet.org/">Planet</a> ile oluşturuyoruz, tasarım <a href="http://www.actsofvolition.com/">Steven Garrity</a>'nin eseri.</p>
</div>
<div class="section">
<a href='http://reklam.lkd.org.tr/www/delivery/ck.php?n=a78599b7&amp;cb=INSERT_RANDOM_NUMBER_HERE' target='_blank'><img
src='http://reklam.lkd.org.tr/www/delivery/avw.php?zoneid=2&amp;cb=INSERT_RANDOM_NUMBER_HERE&amp;n=a78599b7' border='0' alt='' /></a>
</div>
<div class="bloggers section" id="bloggers">
<h3>Üyeler</h3>
<ul>
<TMPL_LOOP Channels>
<li>
<div>
<TMPL_IF face><img class="head" src="images/heads/<TMPL_VAR face ESCAPE="HTML">" title="<TMPL_VAR face>" />
<TMPL_ELSE><img class="head" src="images/heads/nobody.png" title="<TMPL_VAR channel_name>" /></TMPL_IF>
<div class="ircnick">&nbsp;</div>
</div>
<a href="<TMPL_VAR url ESCAPE="HTML">" title="subscribe"><img src="images/feed-icon-10x10.png" alt="(feed)"></a>
<a <TMPL_IF link>href="<TMPL_VAR link ESCAPE="HTML">" </TMPL_IF><TMPL_IF message>class="message" title="<TMPL_VAR message ESCAPE="HTML">"</TMPL_IF><TMPL_UNLESS message>title="<TMPL_VAR title_plain ESCAPE="HTML">"</TMPL_UNLESS>><TMPL_VAR name></a>
</li>
</TMPL_LOOP>
</ul>
</div>
<div class="section">
<h3>Takip edin</h3>
<ul>
<li><a href="http://gezegen.linux.org.tr/rss20.xml">RSS 2.0</a></li>
<li><a href="http://gezegen.linux.org.tr/rss10.xml">RSS 1.0</a></li>
<li><a href="http://gezegen.linux.org.tr/foafroll.xml">FOAF</a></li>
<li><a href="http://gezegen.linux.org.tr/opml.xml">OPML</a></li>
</ul>
</div>
<div class="section">
<h3>Diğer Gezegenler</h3>
<ul>
<li><a href="http://gezegen.pardus.org.tr/">Pardus</a></li>
<li><a href="http://www.kernelplanet.org/">Kernel</a></li>
<li><a href="http://www.planetkde.org/">KDE</a></li>
<li><a href="http://planet.gnome.org">Gnome</a></li>
<li><a href="http://www.planetsuse.org/">SuSE</a></li>
<li><a href="http://planet.python.org">Python</a></li>
<li><a href="http://planet.gentoo.org">Gentoo</a></li>
<li><a href="http://www.go-mono.com/monologue/">MONOlogue</a></li>
<li><a href="http://planetjava.org">Java</a></li>
<li><a href="http://planet.lisp.org">LISP</a></li>
<li><a href="http://planet.perl.org">Perl</a></li>
<li><a href="http://fedoraproject.org/people/">Fedora</a></li>
</ul>
</div>
<div class="section">
<h3>Güncelleme</h3>
<p>Gezegen her 10 dakikada bir yenilenir.</p>
<p>Son güncelleme: <br /><TMPL_VAR date></p>
</div>
<div class="section">
<h3>İletişim</h3>
<p>Linux Gezegeni <a href="mailto:gezegen [at] linux.org.tr">Gezegen Ekibi</a> tarafından yönetilmektedir, Gezegen hakkındaki sorularınızı ve Gezegen'e iniş başvurularınızı e-posta ile iletebilirsiniz.</p>
</div>
</div>
<div id="copyright">
Bu sayfa içerisinde yazılanlar doğru veya yanlış herhangi bir biçimde <a href="http://www.lkd.org.tr/">Linux Kullanıcıları Derneği</a>'ni bağlamaz. <br />
LKD yalnızca Linux Gezegeni için teknik olanakları (sunucu, yazılım, bant genişliği) sağlar.<br />
Ayrıca Gezegen istatistiklerine <a href="http://gezegen.linux.org.tr/stats">buradan</a> ulaşabilirsiniz.<br />
<!-- Start of StatCounter Code -->
<a href="http://www.statcounter.com/" target="_blank"><img src="http://c18.statcounter.com/counter.php?sc_project=1860933&java=0&security=e27e04a9&invisible=0" alt="free tracking" border="0"></a>
<!-- End of StatCounter Code -->
</div>
</body>
</html>

gezegen/opml.xml.tmpl Normal file

@@ -0,0 +1,16 @@
<?xml version="1.0"?>
<opml version="1.1">
<head>
<title><TMPL_VAR name></title>
<dateCreated><TMPL_VAR date_822></dateCreated>
<dateModified><TMPL_VAR date_822></dateModified>
<ownerName><TMPL_VAR owner_name></ownerName>
<ownerEmail><TMPL_VAR owner_email></ownerEmail>
</head>
<body>
<TMPL_LOOP Channels>
<outline text="<TMPL_VAR name ESCAPE="HTML">" xmlUrl="<TMPL_VAR uri ESCAPE="HTML">"/>
</TMPL_LOOP>
</body>
</opml>

gezegen/rss10.xml.tmpl Normal file

@@ -0,0 +1,37 @@
<?xml version="1.0"?>
<rdf:RDF
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:foaf="http://xmlns.com/foaf/0.1/"
xmlns:content="http://purl.org/rss/1.0/modules/content/"
xmlns="http://purl.org/rss/1.0/"
>
<channel rdf:about="<TMPL_VAR link ESCAPE="HTML">">
<title><TMPL_VAR name></title>
<link><TMPL_VAR link ESCAPE="HTML"></link>
<description><TMPL_VAR name> - <TMPL_VAR link ESCAPE="HTML"></description>
<items>
<rdf:Seq>
<TMPL_LOOP Items>
<rdf:li rdf:resource="<TMPL_VAR id ESCAPE="HTML">" />
</TMPL_LOOP>
</rdf:Seq>
</items>
</channel>
<TMPL_LOOP Items>
<item rdf:about="<TMPL_VAR id ESCAPE="HTML">">
<title><TMPL_VAR channel_name><TMPL_IF title>: <TMPL_VAR title></TMPL_IF></title>
<link><TMPL_VAR link ESCAPE="HTML"></link>
<TMPL_IF content>
<content:encoded><TMPL_VAR content ESCAPE="HTML"></content:encoded>
</TMPL_IF>
<dc:date><TMPL_VAR date_iso></dc:date>
<TMPL_IF creator>
<dc:creator><TMPL_VAR creator></dc:creator>
</TMPL_IF>
</item>
</TMPL_LOOP>
</rdf:RDF>

gezegen/rss20.xml.tmpl Normal file

@@ -0,0 +1,30 @@
<?xml version="1.0"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/">
<channel>
<title><TMPL_VAR name></title>
<link><TMPL_VAR link ESCAPE="HTML"></link>
<language>tr</language>
<description><TMPL_VAR name> - <TMPL_VAR link ESCAPE="HTML"></description>
<TMPL_LOOP Items>
<item>
<title><TMPL_VAR channel_name><TMPL_IF title>: <TMPL_VAR title></TMPL_IF></title>
<guid><TMPL_VAR id ESCAPE="HTML"></guid>
<link><TMPL_VAR link ESCAPE="HTML"></link>
<TMPL_IF content>
<description>
<TMPL_IF channel_face>
<![CDATA[<img src="http://gezegen.linux.org.tr/images/heads/<TMPL_VAR channel_face ESCAPE="HTML">" align="right" width="<TMPL_VAR channel_facewidth ESCAPE="HTML">" height="<TMPL_VAR channel_faceheight ESCAPE="HTML">">]]>
</TMPL_IF>
<TMPL_VAR content ESCAPE="HTML"></description>
</TMPL_IF>
<pubDate><TMPL_VAR date_822></pubDate>
<TMPL_IF creator>
<dc:creator><TMPL_VAR creator></dc:creator>
</TMPL_IF>
</item>
</TMPL_LOOP>
</channel>
</rss>

gezegen/zaman.sh Normal file

@@ -0,0 +1,6 @@
#!/bin/bash
# Prefix each line read from stdin with the current date and time.
while read x
do
    echo "$(date)::$x"
done
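# Hypothetical usage (an assumption; the repository doesn't say how this
# script is invoked), e.g. timestamping the aggregator's log output:
#
#   ./planet.py gezegen/config.ini 2>&1 | gezegen/zaman.sh >> planet.log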

planet-cache.py Normal file

@@ -0,0 +1,194 @@
#!/usr/bin/env python
# -*- coding: UTF-8 -*-
"""Planet cache tool.
"""

__authors__ = [ "Scott James Remnant <scott@netsplit.com>",
                "Jeff Waugh <jdub@perkypants.org>" ]
__license__ = "Python"

import os
import sys
import time
import dbhash
import ConfigParser

import planet


def usage():
    print "Usage: planet-cache [options] CACHEFILE [ITEMID]..."
    print
    print "Examine and modify information in the Planet cache."
    print
    print "Channel Commands:"
    print " -C, --channel   Display known information on the channel"
    print " -L, --list      List items in the channel"
    print " -K, --keys      List all keys found in channel items"
    print
    print "Item Commands (need ITEMID):"
    print " -I, --item      Display known information about the item(s)"
    print " -H, --hide      Mark the item(s) as hidden"
    print " -U, --unhide    Mark the item(s) as not hidden"
    print
    print "Other Options:"
    print " -h, --help      Display this help message and exit"
    sys.exit(0)

def usage_error(msg, *args):
    print >>sys.stderr, msg, " ".join(args)
    print >>sys.stderr, "Perhaps you need --help ?"
    sys.exit(1)

def print_keys(item, title):
    keys = item.keys()
    keys.sort()
    key_len = max([ len(k) for k in keys ])

    print title + ":"
    for key in keys:
        if item.key_type(key) == item.DATE:
            value = time.strftime(planet.TIMEFMT_ISO, item[key])
        else:
            value = str(item[key])
        print "    %-*s  %s" % (key_len, key, fit_str(value, 74 - key_len))

def fit_str(string, length):
    if len(string) <= length:
        return string
    else:
        return string[:length-4] + " ..."


if __name__ == "__main__":
    cache_file = None
    want_ids = 0
    ids = []

    command = None

    for arg in sys.argv[1:]:
        if arg == "-h" or arg == "--help":
            usage()
        elif arg == "-C" or arg == "--channel":
            if command is not None:
                usage_error("Only one command option may be supplied")
            command = "channel"
        elif arg == "-L" or arg == "--list":
            if command is not None:
                usage_error("Only one command option may be supplied")
            command = "list"
        elif arg == "-K" or arg == "--keys":
            if command is not None:
                usage_error("Only one command option may be supplied")
            command = "keys"
        elif arg == "-I" or arg == "--item":
            if command is not None:
                usage_error("Only one command option may be supplied")
            command = "item"
            want_ids = 1
        elif arg == "-H" or arg == "--hide":
            if command is not None:
                usage_error("Only one command option may be supplied")
            command = "hide"
            want_ids = 1
        elif arg == "-U" or arg == "--unhide":
            if command is not None:
                usage_error("Only one command option may be supplied")
            command = "unhide"
            want_ids = 1
        elif arg.startswith("-"):
            usage_error("Unknown option:", arg)
        else:
            if cache_file is None:
                cache_file = arg
            elif want_ids:
                ids.append(arg)
            else:
                usage_error("Unexpected extra argument:", arg)

    if cache_file is None:
        usage_error("Missing expected cache filename")
    elif want_ids and not len(ids):
        usage_error("Missing expected entry ids")

    # Open the cache file directly to get the URL it represents
    try:
        db = dbhash.open(cache_file)
        url = db["url"]
        db.close()
    except dbhash.bsddb._db.DBError, e:
        print >>sys.stderr, cache_file + ":", e.args[1]
        sys.exit(1)
    except KeyError:
        print >>sys.stderr, cache_file + ": Probably not a cache file"
        sys.exit(1)

    # Now do it the right way :-)
    my_planet = planet.Planet(ConfigParser.ConfigParser())
    my_planet.cache_directory = os.path.dirname(cache_file)
    channel = planet.Channel(my_planet, url)

    for item_id in ids:
        if not channel.has_item(item_id):
            print >>sys.stderr, item_id + ": Not in channel"
            sys.exit(1)

    # Do the user's bidding
    if command == "channel":
        print_keys(channel, "Channel Keys")

    elif command == "item":
        for item_id in ids:
            item = channel.get_item(item_id)
            print_keys(item, "Item Keys for %s" % item_id)

    elif command == "list":
        print "Items in Channel:"
        for item in channel.items(hidden=1, sorted=1):
            print "    " + item.id
            print "        " + time.strftime(planet.TIMEFMT_ISO, item.date)
            if hasattr(item, "title"):
                print "        " + fit_str(item.title, 70)
            if hasattr(item, "hidden"):
                print "        (hidden)"

    elif command == "keys":
        keys = {}
        for item in channel.items():
            for key in item.keys():
                keys[key] = 1

        keys = keys.keys()
        keys.sort()

        print "Keys used in Channel:"
        for key in keys:
            print "    " + key
        print

        print "Use --item to output values of particular items."

    elif command == "hide":
        for item_id in ids:
            item = channel.get_item(item_id)
            if hasattr(item, "hidden"):
                print item_id + ": Already hidden."
            else:
                item.hidden = "yes"

        channel.cache_write()
        print "Done."

    elif command == "unhide":
        for item_id in ids:
            item = channel.get_item(item_id)
            if hasattr(item, "hidden"):
                del(item.hidden)
            else:
                print item_id + ": Not hidden."

        channel.cache_write()
        print "Done."
planet.py Normal file

@@ -0,0 +1,168 @@
#!/usr/bin/env python
"""The Planet aggregator.

A flexible and easy-to-use aggregator for generating websites.

Visit http://www.planetplanet.org/ for more information and to download
the latest version.

Requires Python 2.1, recommends 2.3.
"""

__authors__ = [ "Scott James Remnant <scott@netsplit.com>",
                "Jeff Waugh <jdub@perkypants.org>" ]
__license__ = "Python"

import os
import sys
import time
import locale
import urlparse

import planet

from ConfigParser import ConfigParser

# Default configuration file path
CONFIG_FILE = "config.ini"

# Defaults for the [Planet] config section
PLANET_NAME = "Unconfigured Planet"
PLANET_LINK = "Unconfigured Planet"
PLANET_FEED = None
OWNER_NAME = "Anonymous Coward"
OWNER_EMAIL = ""
LOG_LEVEL = "WARNING"
FEED_TIMEOUT = 20 # seconds

# Default template file list
TEMPLATE_FILES = "examples/basic/planet.html.tmpl"


def config_get(config, section, option, default=None, raw=0, vars=None):
    """Get a value from the configuration, with a default."""
    if config.has_option(section, option):
        return config.get(section, option, raw=raw, vars=None)
    else:
        return default

def main():
    config_file = CONFIG_FILE
    offline = 0
    verbose = 0

    for arg in sys.argv[1:]:
        if arg == "-h" or arg == "--help":
            print "Usage: planet [options] [CONFIGFILE]"
            print
            print "Options:"
            print " -v, --verbose   DEBUG level logging during update"
            print " -o, --offline   Update the Planet from the cache only"
            print " -h, --help      Display this help message and exit"
            print
            sys.exit(0)
        elif arg == "-v" or arg == "--verbose":
            verbose = 1
        elif arg == "-o" or arg == "--offline":
            offline = 1
        elif arg.startswith("-"):
            print >>sys.stderr, "Unknown option:", arg
            sys.exit(1)
        else:
            config_file = arg

    # Read the configuration file
    config = ConfigParser()
    config.read(config_file)
    if not config.has_section("Planet"):
        print >>sys.stderr, "Configuration missing [Planet] section."
        sys.exit(1)

    # Read the [Planet] config section
    planet_name = config_get(config, "Planet", "name", PLANET_NAME)
    planet_link = config_get(config, "Planet", "link", PLANET_LINK)
    planet_feed = config_get(config, "Planet", "feed", PLANET_FEED)
    owner_name = config_get(config, "Planet", "owner_name", OWNER_NAME)
    owner_email = config_get(config, "Planet", "owner_email", OWNER_EMAIL)
    if verbose:
        log_level = "DEBUG"
    else:
        log_level = config_get(config, "Planet", "log_level", LOG_LEVEL)
    feed_timeout = config_get(config, "Planet", "feed_timeout", FEED_TIMEOUT)
    template_files = config_get(config, "Planet", "template_files",
                                TEMPLATE_FILES).split(" ")

    # Default feed to the first feed for which there is a template
    if not planet_feed:
        for template_file in template_files:
            name = os.path.splitext(os.path.basename(template_file))[0]
            if name.find('atom')>=0 or name.find('rss')>=0:
                planet_feed = urlparse.urljoin(planet_link, name)
                break

    # Define locale
    if config.has_option("Planet", "locale"):
        # The user can specify more than one locale (separated by ":") as
        # fallbacks.
        locale_ok = False
        for user_locale in config.get("Planet", "locale").split(':'):
            user_locale = user_locale.strip()
            try:
                locale.setlocale(locale.LC_ALL, user_locale)
            except locale.Error:
                pass
            else:
                locale_ok = True
                break
        if not locale_ok:
            print >>sys.stderr, "Unsupported locale setting."
            sys.exit(1)

    # Activate logging
    planet.logging.basicConfig()
    planet.logging.getLogger().setLevel(planet.logging.getLevelName(log_level))
    log = planet.logging.getLogger("planet.runner")
    try:
        log.warning
    except:
        log.warning = log.warn

    # timeoutsocket allows feedparser to time out rather than hang forever on
    # ultra-slow servers. Python 2.3 now has this functionality available in
    # the standard socket library, so under 2.3 you don't need to install
    # anything. But you probably should anyway, because the socket module is
    # buggy and timeoutsocket is better.
    if feed_timeout:
        try:
            feed_timeout = float(feed_timeout)
        except:
            log.warning("Feed timeout set to invalid value '%s', skipping", feed_timeout)
            feed_timeout = None

    if feed_timeout and not offline:
        try:
            from planet import timeoutsocket
            timeoutsocket.setDefaultSocketTimeout(feed_timeout)
            log.debug("Socket timeout set to %d seconds", feed_timeout)
        except ImportError:
            import socket
            if hasattr(socket, 'setdefaulttimeout'):
                log.debug("timeoutsocket not found, using python function")
                socket.setdefaulttimeout(feed_timeout)
                log.debug("Socket timeout set to %d seconds", feed_timeout)
            else:
                log.error("Unable to set timeout to %d seconds", feed_timeout)

    # run the planet
    my_planet = planet.Planet(config)
    my_planet.run(planet_name, planet_link, template_files, offline)

    my_planet.generate_all_files(template_files, planet_name,
                                 planet_link, planet_feed, owner_name,
                                 owner_email)


if __name__ == "__main__":
    main()

planet/__init__.py Normal file

@@ -0,0 +1,953 @@
#!/usr/bin/env python
# -*- coding: UTF-8 -*-
"""Planet aggregator library.

This package is a library for developing web sites or software that
aggregate RSS, CDF and Atom feeds taken from elsewhere into a single,
combined feed.
"""

__version__ = "2.0"
__authors__ = [ "Scott James Remnant <scott@netsplit.com>",
                "Jeff Waugh <jdub@perkypants.org>" ]
__license__ = "Python"

# Modules available without separate import
import cache
import feedparser
import sanitize
import htmltmpl
import sgmllib
try:
    import logging
except:
    import compat_logging as logging

# Limit the effect of "from planet import *"
__all__ = ("cache", "feedparser", "htmltmpl", "logging",
           "Planet", "Channel", "NewsItem")

import os
import md5
import time
import dbhash
import re

try:
    from xml.sax.saxutils import escape
except:
    def escape(data):
        return data.replace("&","&amp;").replace(">","&gt;").replace("<","&lt;")

# Version information (for generator headers)
VERSION = ("Planet/%s +http://www.planetplanet.org" % __version__)

# Default User-Agent header to send when retrieving feeds
USER_AGENT = VERSION + " " + feedparser.USER_AGENT

# Default cache directory
CACHE_DIRECTORY = "cache"

# Default number of items to display from a new feed
NEW_FEED_ITEMS = 10

# Useful common date/time formats
TIMEFMT_ISO = "%Y-%m-%dT%H:%M:%S+00:00"
TIMEFMT_822 = "%a, %d %b %Y %H:%M:%S +0000"

# Log instance to use here
log = logging.getLogger("planet")
try:
    log.warning
except:
    log.warning = log.warn

# Defaults for the template file config sections
ENCODING = "utf-8"
ITEMS_PER_PAGE = 60
DAYS_PER_PAGE = 0
OUTPUT_DIR = "output"
DATE_FORMAT = "%B %d, %Y %I:%M %p"
NEW_DATE_FORMAT = "%B %d, %Y"
ACTIVITY_THRESHOLD = 0


class stripHtml(sgmllib.SGMLParser):
    "remove all tags from the data"
    def __init__(self, data):
        sgmllib.SGMLParser.__init__(self)
        self.result=''
        self.feed(data)
        self.close()
    def handle_data(self, data):
        if data: self.result+=data

def template_info(item, date_format):
    """Produce a dictionary of template information."""
    info = {}
    for key in item.keys():
        if item.key_type(key) == item.DATE:
            date = item.get_as_date(key)
            info[key] = time.strftime(date_format, date)
            info[key + "_iso"] = time.strftime(TIMEFMT_ISO, date)
            info[key + "_822"] = time.strftime(TIMEFMT_822, date)
        else:
            info[key] = item[key]
    if 'title' in item.keys():
        info['title_plain'] = stripHtml(info['title']).result
    return info


class Planet:
    """A set of channels.

    This class represents a set of channels for which the items will
    be aggregated together into one combined feed.

    Properties:
        user_agent      User-Agent header to fetch feeds with.
        cache_directory Directory to store cached channels in.
        new_feed_items  Number of items to display from a new feed.
        filter          A regular expression that articles must match.
        exclude         A regular expression that articles must not match.
    """
    def __init__(self, config):
        self.config = config

        self._channels = []

        self.user_agent = USER_AGENT
        self.cache_directory = CACHE_DIRECTORY
        self.new_feed_items = NEW_FEED_ITEMS
        self.filter = None
        self.exclude = None

    def tmpl_config_get(self, template, option, default=None, raw=0, vars=None):
        """Get a template value from the configuration, with a default."""
        if self.config.has_option(template, option):
            return self.config.get(template, option, raw=raw, vars=None)
        elif self.config.has_option("Planet", option):
            return self.config.get("Planet", option, raw=raw, vars=None)
        else:
            return default

    def gather_channel_info(self, template_file="Planet"):
        date_format = self.tmpl_config_get(template_file,
                                           "date_format", DATE_FORMAT, raw=1)

        activity_threshold = int(self.tmpl_config_get(template_file,
                                                      "activity_threshold",
                                                      ACTIVITY_THRESHOLD))

        if activity_threshold:
            activity_horizon = \
                time.gmtime(time.time()-86400*activity_threshold)
        else:
            activity_horizon = 0

        channels = {}
        channels_list = []
        for channel in self.channels(hidden=1):
            channels[channel] = template_info(channel, date_format)
            channels_list.append(channels[channel])

            # identify inactive feeds
            if activity_horizon:
                latest = channel.items(sorted=1)
                if len(latest)==0 or latest[0].date < activity_horizon:
                    channels[channel]["message"] = \
                        "no activity in %d days" % activity_threshold

            # report channel level errors
            if not channel.url_status: continue
            status = int(channel.url_status)
            if status == 403:
                channels[channel]["message"] = "403: forbidden"
            elif status == 404:
                channels[channel]["message"] = "404: not found"
            elif status == 408:
                channels[channel]["message"] = "408: request timeout"
            elif status == 410:
                channels[channel]["message"] = "410: gone"
            elif status == 500:
                channels[channel]["message"] = "internal server error"
            elif status >= 400:
                channels[channel]["message"] = "http status %s" % status

        return channels, channels_list

    def gather_items_info(self, channels, template_file="Planet", channel_list=None):
        items_list = []
        prev_date = []
        prev_channel = None

        date_format = self.tmpl_config_get(template_file,
                                           "date_format", DATE_FORMAT, raw=1)
        items_per_page = int(self.tmpl_config_get(template_file,
                                                  "items_per_page", ITEMS_PER_PAGE))
        days_per_page = int(self.tmpl_config_get(template_file,
                                                 "days_per_page", DAYS_PER_PAGE))
        new_date_format = self.tmpl_config_get(template_file,
                                               "new_date_format", NEW_DATE_FORMAT, raw=1)

        for newsitem in self.items(max_items=items_per_page,
                                   max_days=days_per_page,
                                   channels=channel_list):
            item_info = template_info(newsitem, date_format)
            chan_info = channels[newsitem._channel]
            for k, v in chan_info.items():
                item_info["channel_" + k] = v

            # Check for the start of a new day
            if prev_date[:3] != newsitem.date[:3]:
                prev_date = newsitem.date
                item_info["new_date"] = time.strftime(new_date_format,
                                                      newsitem.date)

            # Check for the start of a new channel
            if item_info.has_key("new_date") \
                   or prev_channel != newsitem._channel:
                prev_channel = newsitem._channel
                item_info["new_channel"] = newsitem._channel.url

            items_list.append(item_info)

        return items_list

    def run(self, planet_name, planet_link, template_files, offline = False):
        log = logging.getLogger("planet.runner")

        # Create a planet
        log.info("Loading cached data")
        if self.config.has_option("Planet", "cache_directory"):
            self.cache_directory = self.config.get("Planet", "cache_directory")
        if self.config.has_option("Planet", "new_feed_items"):
            self.new_feed_items = int(self.config.get("Planet", "new_feed_items"))
        self.user_agent = "%s +%s %s" % (planet_name, planet_link,
                                         self.user_agent)
        if self.config.has_option("Planet", "filter"):
            self.filter = self.config.get("Planet", "filter")

        # The other configuration blocks are channels to subscribe to
        for feed_url in self.config.sections():
            if feed_url == "Planet" or feed_url in template_files:
                continue
            log.info(feed_url)

            # Create a channel, configure it and subscribe it
            channel = Channel(self, feed_url)
            self.subscribe(channel)

            # Update it
            try:
                if not offline and not channel.url_status == '410':
                    channel.update()
            except KeyboardInterrupt:
                raise
            except:
                log.exception("Update of <%s> failed", feed_url)

    def generate_all_files(self, template_files, planet_name,
                           planet_link, planet_feed, owner_name, owner_email):
        log = logging.getLogger("planet.runner")

        # Go-go-gadget-template
        for template_file in template_files:
            manager = htmltmpl.TemplateManager()
            log.info("Processing template %s", template_file)
            try:
                template = manager.prepare(template_file)
            except htmltmpl.TemplateError:
                template = manager.prepare(os.path.basename(template_file))

            # Read the configuration
            output_dir = self.tmpl_config_get(template_file,
                                              "output_dir", OUTPUT_DIR)
            date_format = self.tmpl_config_get(template_file,
                                               "date_format", DATE_FORMAT, raw=1)
            encoding = self.tmpl_config_get(template_file, "encoding", ENCODING)

            # We treat each template individually
            base = os.path.splitext(os.path.basename(template_file))[0]
            url = os.path.join(planet_link, base)
            output_file = os.path.join(output_dir, base)

            # Gather information
            channels, channels_list = self.gather_channel_info(template_file)
            items_list = self.gather_items_info(channels, template_file)

            # Gather item information

            # Process the template
            tp = htmltmpl.TemplateProcessor(html_escape=0)
            tp.set("Items", items_list)
            tp.set("Channels", channels_list)

            # Generic information
            tp.set("generator", VERSION)
            tp.set("name", planet_name)
            tp.set("link", planet_link)
            tp.set("owner_name", owner_name)
tp.set("owner_email", owner_email)
tp.set("url", url)
if planet_feed:
tp.set("feed", planet_feed)
tp.set("feedtype", planet_feed.find('rss')>=0 and 'rss' or 'atom')
# Update time
date = time.gmtime()
tp.set("date", time.strftime(date_format, date))
tp.set("date_iso", time.strftime(TIMEFMT_ISO, date))
tp.set("date_822", time.strftime(TIMEFMT_822, date))
try:
log.info("Writing %s", output_file)
output_fd = open(output_file, "w")
if encoding.lower() in ("utf-8", "utf8"):
# UTF-8 output is the default because we use that internally
output_fd.write(tp.process(template))
elif encoding.lower() in ("xml", "html", "sgml"):
# Magic for Python 2.3 users
output = tp.process(template).decode("utf-8")
output_fd.write(output.encode("ascii", "xmlcharrefreplace"))
else:
# Must be a "known" encoding
output = tp.process(template).decode("utf-8")
output_fd.write(output.encode(encoding, "replace"))
output_fd.close()
except KeyboardInterrupt:
raise
except:
log.exception("Write of %s failed", output_file)
def channels(self, hidden=0, sorted=1):
"""Return the list of channels."""
channels = []
for channel in self._channels:
if hidden or not channel.has_key("hidden"):
channels.append((channel.name, channel))
if sorted:
channels.sort()
return [ c[-1] for c in channels ]
def find_by_basename(self, basename):
for channel in self._channels:
if basename == channel.cache_basename(): return channel
def subscribe(self, channel):
"""Subscribe the planet to the channel."""
self._channels.append(channel)
def unsubscribe(self, channel):
"""Unsubscribe the planet from the channel."""
self._channels.remove(channel)
def items(self, hidden=0, sorted=1, max_items=0, max_days=0, channels=None):
"""Return an optionally filtered list of items in the channel.
The filters are applied in the following order:
If hidden is true then items in hidden channels and hidden items
will be returned.
If sorted is true then the item list will be sorted with the newest
first.
If max_items is non-zero then this number of items, at most, will
be returned.
If max_days is non-zero then any items older than the newest by
this number of days won't be returned. Requires sorted=1 to work.
        The sharp-eyed will note that this looks a little strange code-wise;
        it turns out that Python gets *really* slow if we try to sort the
        actual items themselves. We also use mktime here, but that's OK
        because we discard the numbers and just need them to be relatively
        consistent with each other.
"""
planet_filter_re = None
if self.filter:
planet_filter_re = re.compile(self.filter, re.I)
planet_exclude_re = None
if self.exclude:
planet_exclude_re = re.compile(self.exclude, re.I)
items = []
seen_guids = {}
if not channels: channels=self.channels(hidden=hidden, sorted=0)
for channel in channels:
for item in channel._items.values():
if hidden or not item.has_key("hidden"):
channel_filter_re = None
if channel.filter:
channel_filter_re = re.compile(channel.filter,
re.I)
channel_exclude_re = None
if channel.exclude:
channel_exclude_re = re.compile(channel.exclude,
re.I)
if (planet_filter_re or planet_exclude_re \
or channel_filter_re or channel_exclude_re):
title = ""
if item.has_key("title"):
title = item.title
content = item.get_content("content")
if planet_filter_re:
if not (planet_filter_re.search(title) \
or planet_filter_re.search(content)):
continue
if planet_exclude_re:
if (planet_exclude_re.search(title) \
or planet_exclude_re.search(content)):
continue
if channel_filter_re:
if not (channel_filter_re.search(title) \
or channel_filter_re.search(content)):
continue
if channel_exclude_re:
if (channel_exclude_re.search(title) \
or channel_exclude_re.search(content)):
continue
if not seen_guids.has_key(item.id):
                        seen_guids[item.id] = 1
items.append((time.mktime(item.date), item.order, item))
# Sort the list
if sorted:
items.sort()
items.reverse()
# Apply max_items filter
if len(items) and max_items:
items = items[:max_items]
# Apply max_days filter
if len(items) and max_days:
max_count = 0
            max_time = items[0][0] - max_days * 86400   # seconds per day
for item in items:
if item[0] > max_time:
max_count += 1
else:
items = items[:max_count]
break
return [ i[-1] for i in items ]
class Channel(cache.CachedInfo):
"""A list of news items.
This class represents a list of news items taken from the feed of
a website or other source.
Properties:
url URL of the feed.
url_etag E-Tag of the feed URL.
url_modified Last modified time of the feed URL.
url_status Last HTTP status of the feed URL.
hidden Channel should be hidden (True if exists).
name Name of the feed owner, or feed title.
next_order Next order number to be assigned to NewsItem
updated Correct UTC-Normalised update time of the feed.
last_updated Correct UTC-Normalised time the feed was last updated.
id An identifier the feed claims is unique (*).
title One-line title (*).
link Link to the original format feed (*).
tagline Short description of the feed (*).
info Longer description of the feed (*).
modified Date the feed claims to have been modified (*).
author Name of the author (*).
publisher Name of the publisher (*).
generator Name of the feed generator (*).
category Category name (*).
copyright Copyright information for humans to read (*).
license Link to the licence for the content (*).
docs Link to the specification of the feed format (*).
language Primary language (*).
errorreportsto E-Mail address to send error reports to (*).
image_url URL of an associated image (*).
image_link Link to go with the associated image (*).
image_title Alternative text of the associated image (*).
image_width Width of the associated image (*).
image_height Height of the associated image (*).
filter A regular expression that articles must match.
exclude A regular expression that articles must not match.
Properties marked (*) will only be present if the original feed
contained them. Note that the optional 'modified' date field is simply
a claim made by the item and parsed from the information given, 'updated'
(and 'last_updated') are far more reliable sources of information.
Some feeds may define additional properties to those above.
"""
IGNORE_KEYS = ("links", "contributors", "textinput", "cloud", "categories",
"url", "href", "url_etag", "url_modified", "tags", "itunes_explicit")
def __init__(self, planet, url):
if not os.path.isdir(planet.cache_directory):
os.makedirs(planet.cache_directory)
cache_filename = cache.filename(planet.cache_directory, url)
cache_file = dbhash.open(cache_filename, "c", 0666)
cache.CachedInfo.__init__(self, cache_file, url, root=1)
self._items = {}
self._planet = planet
self._expired = []
self.url = url
# retain the original URL for error reporting
self.configured_url = url
self.url_etag = None
self.url_status = None
self.url_modified = None
self.name = None
self.updated = None
self.last_updated = None
self.filter = None
self.exclude = None
self.next_order = "0"
self.cache_read()
self.cache_read_entries()
if planet.config.has_section(url):
for option in planet.config.options(url):
value = planet.config.get(url, option)
self.set_as_string(option, value, cached=0)
def has_item(self, id_):
"""Check whether the item exists in the channel."""
return self._items.has_key(id_)
def get_item(self, id_):
"""Return the item from the channel."""
return self._items[id_]
# Special methods
__contains__ = has_item
def items(self, hidden=0, sorted=0):
"""Return the item list."""
items = []
for item in self._items.values():
if hidden or not item.has_key("hidden"):
items.append((time.mktime(item.date), item.order, item))
if sorted:
items.sort()
items.reverse()
return [ i[-1] for i in items ]
def __iter__(self):
"""Iterate the sorted item list."""
return iter(self.items(sorted=1))
def cache_read_entries(self):
"""Read entry information from the cache."""
keys = self._cache.keys()
for key in keys:
if key.find(" ") != -1: continue
if self.has_key(key): continue
item = NewsItem(self, key)
self._items[key] = item
def cache_basename(self):
return cache.filename('',self._id)
def cache_write(self, sync=1):
"""Write channel and item information to the cache."""
for item in self._items.values():
item.cache_write(sync=0)
for item in self._expired:
item.cache_clear(sync=0)
cache.CachedInfo.cache_write(self, sync)
self._expired = []
def feed_information(self):
"""
Returns a description string for the feed embedded in this channel.
This will usually simply be the feed url embedded in <>, but in the
case where the current self.url has changed from the original
self.configured_url the string will contain both pieces of information.
This is so that the URL in question is easier to find in logging
output: getting an error about a URL that doesn't appear in your config
file is annoying.
"""
if self.url == self.configured_url:
return "<%s>" % self.url
else:
return "<%s> (formerly <%s>)" % (self.url, self.configured_url)
def update(self):
"""Download the feed to refresh the information.
This does the actual work of pulling down the feed and if it changes
updates the cached information about the feed and entries within it.
"""
info = feedparser.parse(self.url,
etag=self.url_etag, modified=self.url_modified,
agent=self._planet.user_agent)
if info.has_key("status"):
self.url_status = str(info.status)
elif info.has_key("entries") and len(info.entries)>0:
self.url_status = str(200)
elif info.bozo and info.bozo_exception.__class__.__name__=='Timeout':
self.url_status = str(408)
else:
self.url_status = str(500)
if self.url_status == '301' and \
(info.has_key("entries") and len(info.entries)>0):
log.warning("Feed has moved from <%s> to <%s>", self.url, info.url)
try:
os.link(cache.filename(self._planet.cache_directory, self.url),
cache.filename(self._planet.cache_directory, info.url))
            except OSError:
                pass
self.url = info.url
elif self.url_status == '304':
log.info("Feed %s unchanged", self.feed_information())
return
elif self.url_status == '410':
log.info("Feed %s gone", self.feed_information())
self.cache_write()
return
elif self.url_status == '408':
log.warning("Feed %s timed out", self.feed_information())
return
elif int(self.url_status) >= 400:
log.error("Error %s while updating feed %s",
self.url_status, self.feed_information())
return
else:
log.info("Updating feed %s", self.feed_information())
self.url_etag = info.has_key("etag") and info.etag or None
self.url_modified = info.has_key("modified") and info.modified or None
if self.url_etag is not None:
log.debug("E-Tag: %s", self.url_etag)
if self.url_modified is not None:
log.debug("Last Modified: %s",
time.strftime(TIMEFMT_ISO, self.url_modified))
self.update_info(info.feed)
self.update_entries(info.entries)
self.cache_write()
def update_info(self, feed):
"""Update information from the feed.
This reads the feed information supplied by feedparser and updates
the cached information about the feed. These are the various
potentially interesting properties that you might care about.
"""
for key in feed.keys():
if key in self.IGNORE_KEYS or key + "_parsed" in self.IGNORE_KEYS:
# Ignored fields
pass
elif feed.has_key(key + "_parsed"):
# Ignore unparsed date fields
pass
elif key.endswith("_detail"):
# retain name and email sub-fields
if feed[key].has_key('name') and feed[key].name:
self.set_as_string(key.replace("_detail","_name"), \
feed[key].name)
if feed[key].has_key('email') and feed[key].email:
self.set_as_string(key.replace("_detail","_email"), \
feed[key].email)
elif key == "items":
# Ignore items field
pass
elif key.endswith("_parsed"):
# Date fields
if feed[key] is not None:
self.set_as_date(key[:-len("_parsed")], feed[key])
elif key == "image":
# Image field: save all the information
if feed[key].has_key("url"):
self.set_as_string(key + "_url", feed[key].url)
if feed[key].has_key("link"):
self.set_as_string(key + "_link", feed[key].link)
if feed[key].has_key("title"):
self.set_as_string(key + "_title", feed[key].title)
if feed[key].has_key("width"):
self.set_as_string(key + "_width", str(feed[key].width))
if feed[key].has_key("height"):
self.set_as_string(key + "_height", str(feed[key].height))
elif isinstance(feed[key], (str, unicode)):
# String fields
try:
detail = key + '_detail'
if feed.has_key(detail) and feed[detail].has_key('type'):
if feed[detail].type == 'text/html':
feed[key] = sanitize.HTML(feed[key])
elif feed[detail].type == 'text/plain':
feed[key] = escape(feed[key])
self.set_as_string(key, feed[key])
except KeyboardInterrupt:
raise
except:
log.exception("Ignored '%s' of <%s>, unknown format",
key, self.url)
def update_entries(self, entries):
"""Update entries from the feed.
This reads the entries supplied by feedparser and updates the
cached information about them. It's at this point we update
the 'updated' timestamp and keep the old one in 'last_updated',
these provide boundaries for acceptable entry times.
If this is the first time a feed has been updated then most of the
items will be marked as hidden, according to Planet.new_feed_items.
        If the feed does not contain items which, according to the sort
        order, should be there, those items are assumed to have been expired
        from the feed or replaced, and are removed from the cache.
"""
if not len(entries):
return
self.last_updated = self.updated
self.updated = time.gmtime()
new_items = []
feed_items = []
for entry in entries:
# Try really hard to find some kind of unique identifier
if entry.has_key("id"):
entry_id = cache.utf8(entry.id)
elif entry.has_key("link"):
entry_id = cache.utf8(entry.link)
elif entry.has_key("title"):
entry_id = (self.url + "/"
+ md5.new(cache.utf8(entry.title)).hexdigest())
elif entry.has_key("summary"):
entry_id = (self.url + "/"
+ md5.new(cache.utf8(entry.summary)).hexdigest())
else:
log.error("Unable to find or generate id, entry ignored")
continue
# Create the item if necessary and update
if self.has_item(entry_id):
item = self._items[entry_id]
else:
item = NewsItem(self, entry_id)
self._items[entry_id] = item
new_items.append(item)
item.update(entry)
feed_items.append(entry_id)
# Hide excess items the first time through
if self.last_updated is None and self._planet.new_feed_items \
and len(feed_items) > self._planet.new_feed_items:
item.hidden = "yes"
log.debug("Marked <%s> as hidden (new feed)", entry_id)
# Assign order numbers in reverse
new_items.reverse()
for item in new_items:
item.order = self.next_order = str(int(self.next_order) + 1)
# Check for expired or replaced items
feed_count = len(feed_items)
log.debug("Items in Feed: %d", feed_count)
for item in self.items(sorted=1):
if feed_count < 1:
break
elif item.id in feed_items:
feed_count -= 1
elif item._channel.url_status != '226':
del(self._items[item.id])
self._expired.append(item)
log.debug("Removed expired or replaced item <%s>", item.id)
def get_name(self, key):
"""Return the key containing the name."""
for key in ("name", "title"):
if self.has_key(key) and self.key_type(key) != self.NULL:
return self.get_as_string(key)
return ""
class NewsItem(cache.CachedInfo):
"""An item of news.
This class represents a single item of news on a channel. They're
created by members of the Channel class and accessible through it.
Properties:
id Channel-unique identifier for this item.
id_hash Relatively short, printable cryptographic hash of id
date Corrected UTC-Normalised update time, for sorting.
order Order in which items on the same date can be sorted.
hidden Item should be hidden (True if exists).
title One-line title (*).
link Link to the original format text (*).
summary Short first-page summary (*).
content Full HTML content.
modified Date the item claims to have been modified (*).
issued Date the item claims to have been issued (*).
created Date the item claims to have been created (*).
expired Date the item claims to expire (*).
author Name of the author (*).
publisher Name of the publisher (*).
category Category name (*).
comments Link to a page to enter comments (*).
license Link to the licence for the content (*).
source_name Name of the original source of this item (*).
source_link Link to the original source of this item (*).
Properties marked (*) will only be present if the original feed
contained them. Note that the various optional date fields are
simply claims made by the item and parsed from the information
given, 'date' is a far more reliable source of information.
Some feeds may define additional properties to those above.
"""
IGNORE_KEYS = ("categories", "contributors", "enclosures", "links",
"guidislink", "date", "tags")
def __init__(self, channel, id_):
cache.CachedInfo.__init__(self, channel._cache, id_)
self._channel = channel
self.id = id_
self.id_hash = md5.new(id_).hexdigest()
self.date = None
self.order = None
self.content = None
self.cache_read()
def update(self, entry):
"""Update the item from the feedparser entry given."""
for key in entry.keys():
if key in self.IGNORE_KEYS or key + "_parsed" in self.IGNORE_KEYS:
# Ignored fields
pass
elif entry.has_key(key + "_parsed"):
# Ignore unparsed date fields
pass
elif key.endswith("_detail"):
# retain name, email, and language sub-fields
if entry[key].has_key('name') and entry[key].name:
self.set_as_string(key.replace("_detail","_name"), \
entry[key].name)
if entry[key].has_key('email') and entry[key].email:
self.set_as_string(key.replace("_detail","_email"), \
entry[key].email)
if entry[key].has_key('language') and entry[key].language and \
(not self._channel.has_key('language') or \
entry[key].language != self._channel.language):
self.set_as_string(key.replace("_detail","_language"), \
entry[key].language)
elif key.endswith("_parsed"):
# Date fields
if entry[key] is not None:
self.set_as_date(key[:-len("_parsed")], entry[key])
elif key == "source":
# Source field: save both url and value
if entry[key].has_key("value"):
self.set_as_string(key + "_name", entry[key].value)
if entry[key].has_key("url"):
self.set_as_string(key + "_link", entry[key].url)
elif key == "content":
# Content field: concatenate the values
value = ""
for item in entry[key]:
if item.type == 'text/html':
item.value = sanitize.HTML(item.value)
elif item.type == 'text/plain':
item.value = escape(item.value)
if item.has_key('language') and item.language and \
(not self._channel.has_key('language') or
item.language != self._channel.language) :
self.set_as_string(key + "_language", item.language)
value += cache.utf8(item.value)
self.set_as_string(key, value)
elif isinstance(entry[key], (str, unicode)):
# String fields
try:
detail = key + '_detail'
if entry.has_key(detail):
if entry[detail].has_key('type'):
if entry[detail].type == 'text/html':
entry[key] = sanitize.HTML(entry[key])
elif entry[detail].type == 'text/plain':
entry[key] = escape(entry[key])
self.set_as_string(key, entry[key])
except KeyboardInterrupt:
raise
except:
log.exception("Ignored '%s' of <%s>, unknown format",
key, self.id)
# Generate the date field if we need to
self.get_date("date")
def get_date(self, key):
"""Get (or update) the date key.
        We check whether the date the entry claims to have been changed falls
        between the time we last updated this feed and the time we pulled the
        feed off the site.
        If it does then it's probably not bogus, and we'll sort accordingly.
        If it doesn't then we bound it appropriately; this ensures that
        entries appear in posting sequence but don't overlap entries
        added in previous updates and don't creep into the next one.
"""
for other_key in ("updated", "modified", "published", "issued", "created"):
if self.has_key(other_key):
date = self.get_as_date(other_key)
break
else:
date = None
if date is not None:
if date > self._channel.updated:
date = self._channel.updated
# elif date < self._channel.last_updated:
# date = self._channel.updated
elif self.has_key(key) and self.key_type(key) != self.NULL:
return self.get_as_date(key)
else:
date = self._channel.updated
self.set_as_date(key, date)
return date
def get_content(self, key):
"""Return the key containing the content."""
for key in ("content", "tagline", "summary"):
if self.has_key(key) and self.key_type(key) != self.NULL:
return self.get_as_string(key)
return ""

948
planet/__init__.py.backup Normal file

@ -0,0 +1,948 @@
#!/usr/bin/env python
# -*- coding: UTF-8 -*-
"""Planet aggregator library.
This package is a library for developing web sites or software that
aggregate RSS, CDF and Atom feeds taken from elsewhere into a single,
combined feed.
"""
__version__ = "1.0"
__authors__ = [ "Scott James Remnant <scott@netsplit.com>",
"Jeff Waugh <jdub@perkypants.org>" ]
__license__ = "Python"
# Modules available without separate import
import cache
import feedparser
import sanitize
import htmltmpl
import sgmllib
try:
import logging
except:
import compat_logging as logging
# Limit the effect of "from planet import *"
__all__ = ("cache", "feedparser", "htmltmpl", "logging",
"Planet", "Channel", "NewsItem")
import locale
import os
import md5
import time
import dbhash
import re
import xml.sax.saxutils
# Version information (for generator headers)
VERSION = ("Planet/%s +http://www.planetplanet.org" % __version__)
# Default User-Agent header to send when retrieving feeds
USER_AGENT = VERSION + " " + feedparser.USER_AGENT
# Default cache directory
CACHE_DIRECTORY = "cache"
# Default number of items to display from a new feed
NEW_FEED_ITEMS = 10
# Useful common date/time formats
TIMEFMT_ISO = "%Y-%m-%dT%H:%M:%S+00:00"
TIMEFMT_822 = "%a, %d %b %Y %H:%M:%S +0000"
# Log instance to use here
log = logging.getLogger("planet")
try:
log.warning
except:
log.warning = log.warn
# Defaults for the template file config sections
ENCODING = "utf-8"
ITEMS_PER_PAGE = 60
DAYS_PER_PAGE = 0
OUTPUT_DIR = "output"
DATE_FORMAT = "%B %d, %Y %I:%M %p"
NEW_DATE_FORMAT = "%B %d, %Y"
ACTIVITY_THRESHOLD = 0
class stripHtml(sgmllib.SGMLParser):
"remove all tags from the data"
def __init__(self, data):
sgmllib.SGMLParser.__init__(self)
self.result=''
self.feed(data)
self.close()
def handle_data(self, data):
if data: self.result+=data
def template_info(item, date_format):
"""Produce a dictionary of template information."""
info = {}
for key in item.keys():
if item.key_type(key) == item.DATE:
date = item.get_as_date(key)
info[key] = time.strftime(date_format, date)
info[key + "_iso"] = time.strftime(TIMEFMT_ISO, date)
info[key + "_822"] = time.strftime(TIMEFMT_822, date)
else:
info[key] = item[key]
if 'title' in item.keys():
info['title_plain'] = stripHtml(info['title']).result
return info
class Planet:
"""A set of channels.
This class represents a set of channels for which the items will
be aggregated together into one combined feed.
Properties:
user_agent User-Agent header to fetch feeds with.
cache_directory Directory to store cached channels in.
new_feed_items Number of items to display from a new feed.
filter A regular expression that articles must match.
exclude A regular expression that articles must not match.
"""
def __init__(self, config):
self.config = config
self._channels = []
self.user_agent = USER_AGENT
self.cache_directory = CACHE_DIRECTORY
self.new_feed_items = NEW_FEED_ITEMS
self.filter = None
self.exclude = None
def tmpl_config_get(self, template, option, default=None, raw=0, vars=None):
"""Get a template value from the configuration, with a default."""
if self.config.has_option(template, option):
return self.config.get(template, option, raw=raw, vars=None)
elif self.config.has_option("Planet", option):
return self.config.get("Planet", option, raw=raw, vars=None)
else:
return default
def gather_channel_info(self, template_file="Planet"):
date_format = self.tmpl_config_get(template_file,
"date_format", DATE_FORMAT, raw=1)
activity_threshold = int(self.tmpl_config_get(template_file,
"activity_threshold",
ACTIVITY_THRESHOLD))
if activity_threshold:
activity_horizon = \
time.gmtime(time.time()-86400*activity_threshold)
else:
activity_horizon = 0
channels = {}
channels_list = []
for channel in self.channels(hidden=1):
channels[channel] = template_info(channel, date_format)
channels_list.append(channels[channel])
# identify inactive feeds
if activity_horizon:
latest = channel.items(sorted=1)
if len(latest)==0 or latest[0].date < activity_horizon:
channels[channel]["message"] = \
"no activity in %d days" % activity_threshold
# report channel level errors
if not channel.url_status: continue
status = int(channel.url_status)
if status == 403:
channels[channel]["message"] = "403: forbidden"
elif status == 404:
channels[channel]["message"] = "404: not found"
elif status == 408:
channels[channel]["message"] = "408: request timeout"
elif status == 410:
channels[channel]["message"] = "410: gone"
elif status == 500:
channels[channel]["message"] = "internal server error"
elif status >= 400:
channels[channel]["message"] = "http status %s" % status
return channels, channels_list
def gather_items_info(self, channels, template_file="Planet", channel_list=None):
items_list = []
prev_date = []
prev_channel = None
date_format = self.tmpl_config_get(template_file,
"date_format", DATE_FORMAT, raw=1)
items_per_page = int(self.tmpl_config_get(template_file,
"items_per_page", ITEMS_PER_PAGE))
days_per_page = int(self.tmpl_config_get(template_file,
"days_per_page", DAYS_PER_PAGE))
new_date_format = self.tmpl_config_get(template_file,
"new_date_format", NEW_DATE_FORMAT, raw=1)
for newsitem in self.items(max_items=items_per_page,
max_days=days_per_page,
channels=channel_list):
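            # Shift each item's timestamp by a fixed two hours (7200 s)
            # before formatting; apparently a hard-coded local-timezone
            # adjustment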
newsitem.date = time.localtime(time.mktime(newsitem.date)+7200)
item_info = template_info(newsitem, date_format)
chan_info = channels[newsitem._channel]
for k, v in chan_info.items():
item_info["channel_" + k] = v
# Check for the start of a new day
if prev_date[:3] != newsitem.date[:3]:
prev_date = newsitem.date
item_info["new_date"] = time.strftime(new_date_format,
newsitem.date)
# Check for the start of a new channel
if item_info.has_key("new_date") \
or prev_channel != newsitem._channel:
prev_channel = newsitem._channel
item_info["new_channel"] = newsitem._channel.url
items_list.append(item_info)
return items_list
def run(self, planet_name, planet_link, template_files, offline = False):
log = logging.getLogger("planet.runner")
# Create a planet
log.info("Loading cached data")
if self.config.has_option("Planet", "cache_directory"):
self.cache_directory = self.config.get("Planet", "cache_directory")
if self.config.has_option("Planet", "new_feed_items"):
self.new_feed_items = int(self.config.get("Planet", "new_feed_items"))
self.user_agent = "%s +%s %s" % (planet_name, planet_link,
self.user_agent)
if self.config.has_option("Planet", "filter"):
self.filter = self.config.get("Planet", "filter")
# The other configuration blocks are channels to subscribe to
for feed_url in self.config.sections():
if feed_url == "Planet" or feed_url in template_files:
continue
# Create a channel, configure it and subscribe it
channel = Channel(self, feed_url)
self.subscribe(channel)
# Update it
try:
if not offline and not channel.url_status == '410':
channel.update()
except KeyboardInterrupt:
raise
except:
log.exception("Update of <%s> failed", feed_url)
def generate_all_files(self, template_files, planet_name,
planet_link, planet_feed, owner_name, owner_email):
log = logging.getLogger("planet.runner")
# Go-go-gadget-template
for template_file in template_files:
manager = htmltmpl.TemplateManager()
log.info("Processing template %s", template_file)
template = manager.prepare(template_file)
# Read the configuration
output_dir = self.tmpl_config_get(template_file,
"output_dir", OUTPUT_DIR)
date_format = self.tmpl_config_get(template_file,
"date_format", DATE_FORMAT, raw=1)
encoding = self.tmpl_config_get(template_file, "encoding", ENCODING)
# We treat each template individually
base = os.path.splitext(os.path.basename(template_file))[0]
url = os.path.join(planet_link, base)
output_file = os.path.join(output_dir, base)
# Gather information
channels, channels_list = self.gather_channel_info(template_file)
items_list = self.gather_items_info(channels, template_file)
# Gather item information
# Process the template
tp = htmltmpl.TemplateProcessor(html_escape=0)
tp.set("Items", items_list)
tp.set("Channels", channels_list)
# Generic information
tp.set("generator", VERSION)
tp.set("name", planet_name)
tp.set("link", planet_link)
tp.set("owner_name", owner_name)
tp.set("owner_email", owner_email)
tp.set("url", url)
if planet_feed:
tp.set("feed", planet_feed)
tp.set("feedtype", planet_feed.find('rss')>=0 and 'rss' or 'atom')
# Update time
date = time.localtime()
tp.set("date", time.strftime(date_format, date))
tp.set("date_iso", time.strftime(TIMEFMT_ISO, date))
tp.set("date_822", time.strftime(TIMEFMT_822, date))
try:
log.info("Writing %s", output_file)
output_fd = open(output_file, "w")
if encoding.lower() in ("utf-8", "utf8"):
# UTF-8 output is the default because we use that internally
output_fd.write(tp.process(template))
elif encoding.lower() in ("xml", "html", "sgml"):
# Magic for Python 2.3 users
output = tp.process(template).decode("utf-8")
output_fd.write(output.encode("ascii", "xmlcharrefreplace"))
else:
# Must be a "known" encoding
output = tp.process(template).decode("utf-8")
output_fd.write(output.encode(encoding, "replace"))
output_fd.close()
except KeyboardInterrupt:
raise
except:
log.exception("Write of %s failed", output_file)
def channels(self, hidden=0, sorted=1):
"""Return the list of channels."""
channels = []
for channel in self._channels:
if hidden or not channel.has_key("hidden"):
channels.append((channel.name, channel))
if sorted:
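            # Sort channel names using Turkish collation, then restore the
            # "C" locale (assumes the tr_TR.UTF-8 locale is available)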
locale.setlocale(locale.LC_ALL,"tr_TR.UTF-8")
channels.sort(key=lambda x: locale.strxfrm(x[0]))
locale.setlocale(locale.LC_ALL,"C")
return [ c[-1] for c in channels ]
def find_by_basename(self, basename):
for channel in self._channels:
if basename == channel.cache_basename(): return channel
def subscribe(self, channel):
"""Subscribe the planet to the channel."""
self._channels.append(channel)
def unsubscribe(self, channel):
"""Unsubscribe the planet from the channel."""
self._channels.remove(channel)
def items(self, hidden=0, sorted=1, max_items=0, max_days=0, channels=None):
"""Return an optionally filtered list of items in the channel.
The filters are applied in the following order:
If hidden is true then items in hidden channels and hidden items
will be returned.
If sorted is true then the item list will be sorted with the newest
first.
If max_items is non-zero then this number of items, at most, will
be returned.
If max_days is non-zero then any items older than the newest by
this number of days won't be returned. Requires sorted=1 to work.
        The sharp-eyed will note that this looks a little strange code-wise;
        it turns out that Python gets *really* slow if we try to sort the
        actual items themselves. We also use mktime here, but that's OK
        because we discard the numbers and just need them to be relatively
        consistent with each other.
"""
planet_filter_re = None
if self.filter:
planet_filter_re = re.compile(self.filter, re.I)
planet_exclude_re = None
if self.exclude:
planet_exclude_re = re.compile(self.exclude, re.I)
items = []
seen_guids = {}
if not channels: channels=self.channels(hidden=hidden, sorted=0)
for channel in channels:
for item in channel._items.values():
if hidden or not item.has_key("hidden"):
channel_filter_re = None
if channel.filter:
channel_filter_re = re.compile(channel.filter,
re.I)
channel_exclude_re = None
if channel.exclude:
channel_exclude_re = re.compile(channel.exclude,
re.I)
if (planet_filter_re or planet_exclude_re \
or channel_filter_re or channel_exclude_re):
title = ""
if item.has_key("title"):
title = item.title
content = item.get_content("content")
if planet_filter_re:
if not (planet_filter_re.search(title) \
or planet_filter_re.search(content)):
continue
if planet_exclude_re:
if (planet_exclude_re.search(title) \
or planet_exclude_re.search(content)):
continue
if channel_filter_re:
if not (channel_filter_re.search(title) \
or channel_filter_re.search(content)):
continue
if channel_exclude_re:
if (channel_exclude_re.search(title) \
or channel_exclude_re.search(content)):
continue
if not seen_guids.has_key(item.id):
                        seen_guids[item.id] = 1
items.append((time.mktime(item.date), item.order, item))
# Sort the list
if sorted:
items.sort()
items.reverse()
# Apply max_items filter
if len(items) and max_items:
items = items[:max_items]
# Apply max_days filter
if len(items) and max_days:
max_count = 0
            max_time = items[0][0] - max_days * 86400   # seconds per day
for item in items:
if item[0] > max_time:
max_count += 1
else:
items = items[:max_count]
break
return [ i[-1] for i in items ]
class Channel(cache.CachedInfo):
"""A list of news items.
This class represents a list of news items taken from the feed of
a website or other source.
Properties:
url URL of the feed.
url_etag E-Tag of the feed URL.
url_modified Last modified time of the feed URL.
url_status Last HTTP status of the feed URL.
hidden Channel should be hidden (True if exists).
name Name of the feed owner, or feed title.
next_order Next order number to be assigned to NewsItem
updated Correct UTC-Normalised update time of the feed.
last_updated Correct UTC-Normalised time the feed was last updated.
id An identifier the feed claims is unique (*).
title One-line title (*).
link Link to the original format feed (*).
tagline Short description of the feed (*).
info Longer description of the feed (*).
modified Date the feed claims to have been modified (*).
author Name of the author (*).
publisher Name of the publisher (*).
generator Name of the feed generator (*).
category Category name (*).
copyright Copyright information for humans to read (*).
license Link to the licence for the content (*).
docs Link to the specification of the feed format (*).
language Primary language (*).
errorreportsto E-Mail address to send error reports to (*).
image_url URL of an associated image (*).
image_link Link to go with the associated image (*).
image_title Alternative text of the associated image (*).
image_width Width of the associated image (*).
image_height Height of the associated image (*).
filter A regular expression that articles must match.
exclude A regular expression that articles must not match.
Properties marked (*) will only be present if the original feed
contained them. Note that the optional 'modified' date field is simply
a claim made by the item and parsed from the information given, 'updated'
(and 'last_updated') are far more reliable sources of information.
Some feeds may define additional properties to those above.
"""
IGNORE_KEYS = ("links", "contributors", "textinput", "cloud", "categories",
"url", "href", "url_etag", "url_modified", "tags", "itunes_explicit")
def __init__(self, planet, url):
if not os.path.isdir(planet.cache_directory):
os.makedirs(planet.cache_directory)
cache_filename = cache.filename(planet.cache_directory, url)
cache_file = dbhash.open(cache_filename, "c", 0666)
cache.CachedInfo.__init__(self, cache_file, url, root=1)
self._items = {}
self._planet = planet
self._expired = []
self.url = url
# retain the original URL for error reporting
self.configured_url = url
self.url_etag = None
self.url_status = None
self.url_modified = None
self.name = None
self.updated = None
self.last_updated = None
self.filter = None
self.exclude = None
self.next_order = "0"
self.cache_read()
self.cache_read_entries()
if planet.config.has_section(url):
for option in planet.config.options(url):
value = planet.config.get(url, option)
self.set_as_string(option, value, cached=0)
def has_item(self, id_):
"""Check whether the item exists in the channel."""
return self._items.has_key(id_)
def get_item(self, id_):
"""Return the item from the channel."""
return self._items[id_]
# Special methods
__contains__ = has_item
def items(self, hidden=0, sorted=0):
"""Return the item list."""
items = []
for item in self._items.values():
if hidden or not item.has_key("hidden"):
items.append((time.mktime(item.date), item.order, item))
if sorted:
items.sort()
items.reverse()
return [ i[-1] for i in items ]
def __iter__(self):
"""Iterate the sorted item list."""
return iter(self.items(sorted=1))
def cache_read_entries(self):
"""Read entry information from the cache."""
keys = self._cache.keys()
for key in keys:
if key.find(" ") != -1: continue
if self.has_key(key): continue
item = NewsItem(self, key)
self._items[key] = item
def cache_basename(self):
return cache.filename('',self._id)
def cache_write(self, sync=1):
"""Write channel and item information to the cache."""
for item in self._items.values():
item.cache_write(sync=0)
for item in self._expired:
item.cache_clear(sync=0)
cache.CachedInfo.cache_write(self, sync)
self._expired = []
def feed_information(self):
"""
Returns a description string for the feed embedded in this channel.
This will usually simply be the feed url embedded in <>, but in the
case where the current self.url has changed from the original
self.configured_url the string will contain both pieces of information.
This is so that the URL in question is easier to find in logging
output: getting an error about a URL that doesn't appear in your config
file is annoying.
"""
if self.url == self.configured_url:
return "<%s>" % self.url
else:
return "<%s> (formerly <%s>)" % (self.url, self.configured_url)
def update(self):
"""Download the feed to refresh the information.
This does the actual work of pulling down the feed and if it changes
updates the cached information about the feed and entries within it.
"""
info = feedparser.parse(self.url,
etag=self.url_etag, modified=self.url_modified,
agent=self._planet.user_agent)
if info.has_key("status"):
self.url_status = str(info.status)
elif info.has_key("entries") and len(info.entries)>0:
self.url_status = str(200)
elif info.bozo and info.bozo_exception.__class__.__name__=='Timeout':
self.url_status = str(408)
else:
self.url_status = str(500)
        if self.url_status == '301' and \
           (info.has_key("entries") and len(info.entries)>0):
            if self.url != info.url:
                log.warning("Feed has moved from <%s> to <%s>",
                            self.url, info.url)
                try:
                    os.link(cache.filename(self._planet.cache_directory, self.url),
                            cache.filename(self._planet.cache_directory, info.url))
                except OSError:
                    pass
                # record the new feed location (was a no-op comparison)
                self.url = info.url
elif self.url_status == '304':
log.info("Feed %s unchanged", self.feed_information())
return
elif self.url_status == '410':
log.info("Feed %s gone", self.feed_information())
self.cache_write()
return
elif self.url_status == '408':
log.warning("Feed %s timed out", self.feed_information())
return
elif int(self.url_status) >= 400:
log.error("Error %s while updating feed %s",
self.url_status, self.feed_information())
return
else:
log.info("Updating feed %s", self.feed_information())
self.url_etag = info.has_key("etag") and info.etag or None
self.url_modified = info.has_key("modified") and info.modified or None
if self.url_etag is not None:
log.debug("E-Tag: %s", self.url_etag)
if self.url_modified is not None:
log.debug("Last Modified: %s",
time.strftime(TIMEFMT_ISO, self.url_modified))
self.update_info(info.feed)
self.update_entries(info.entries)
self.cache_write()
def update_info(self, feed):
"""Update information from the feed.
This reads the feed information supplied by feedparser and updates
the cached information about the feed. These are the various
potentially interesting properties that you might care about.
"""
for key in feed.keys():
if key in self.IGNORE_KEYS or key + "_parsed" in self.IGNORE_KEYS:
# Ignored fields
pass
elif feed.has_key(key + "_parsed"):
# Ignore unparsed date fields
pass
elif key.endswith("_detail"):
# retain name and email sub-fields
if feed[key].has_key('name') and feed[key].name:
self.set_as_string(key.replace("_detail","_name"), \
feed[key].name)
if feed[key].has_key('email') and feed[key].email:
self.set_as_string(key.replace("_detail","_email"), \
feed[key].email)
elif key == "items":
# Ignore items field
pass
elif key.endswith("_parsed"):
# Date fields
if feed[key] is not None:
self.set_as_date(key[:-len("_parsed")], feed[key])
elif key == "image":
# Image field: save all the information
if feed[key].has_key("url"):
self.set_as_string(key + "_url", feed[key].url)
if feed[key].has_key("link"):
self.set_as_string(key + "_link", feed[key].link)
if feed[key].has_key("title"):
self.set_as_string(key + "_title", feed[key].title)
if feed[key].has_key("width"):
self.set_as_string(key + "_width", str(feed[key].width))
if feed[key].has_key("height"):
self.set_as_string(key + "_height", str(feed[key].height))
elif isinstance(feed[key], (str, unicode)):
# String fields
try:
detail = key + '_detail'
if feed.has_key(detail) and feed[detail].has_key('type'):
if feed[detail].type == 'text/html':
feed[key] = sanitize.HTML(feed[key])
elif feed[detail].type == 'text/plain':
feed[key] = xml.sax.saxutils.escape(feed[key])
self.set_as_string(key, feed[key])
except KeyboardInterrupt:
raise
except:
log.exception("Ignored '%s' of <%s>, unknown format",
key, self.url)
def update_entries(self, entries):
"""Update entries from the feed.
This reads the entries supplied by feedparser and updates the
cached information about them. It's at this point we update
the 'updated' timestamp and keep the old one in 'last_updated',
these provide boundaries for acceptable entry times.
If this is the first time a feed has been updated then most of the
items will be marked as hidden, according to Planet.new_feed_items.
        If the feed does not contain items which, according to the sort
        order, should be there, those items are assumed to have been expired
        from the feed or replaced, and are removed from the cache.
"""
if not len(entries):
return
self.last_updated = self.updated
self.updated = time.gmtime()
new_items = []
feed_items = []
for entry in entries:
# Try really hard to find some kind of unique identifier
if entry.has_key("id"):
entry_id = cache.utf8(entry.id)
elif entry.has_key("link"):
entry_id = cache.utf8(entry.link)
elif entry.has_key("title"):
entry_id = (self.url + "/"
+ md5.new(cache.utf8(entry.title)).hexdigest())
elif entry.has_key("summary"):
entry_id = (self.url + "/"
+ md5.new(cache.utf8(entry.summary)).hexdigest())
else:
log.error("Unable to find or generate id, entry ignored")
continue
# Create the item if necessary and update
if self.has_item(entry_id):
item = self._items[entry_id]
else:
item = NewsItem(self, entry_id)
self._items[entry_id] = item
new_items.append(item)
item.update(entry)
feed_items.append(entry_id)
# Hide excess items the first time through
if self.last_updated is None and self._planet.new_feed_items \
and len(feed_items) > self._planet.new_feed_items:
item.hidden = "yes"
log.debug("Marked <%s> as hidden (new feed)", entry_id)
# Assign order numbers in reverse
new_items.reverse()
for item in new_items:
item.order = self.next_order = str(int(self.next_order) + 1)
# Check for expired or replaced items
feed_count = len(feed_items)
log.debug("Items in Feed: %d", feed_count)
for item in self.items(sorted=1):
if feed_count < 1:
break
elif item.id in feed_items:
feed_count -= 1
elif item._channel.url_status != '226':
del(self._items[item.id])
self._expired.append(item)
log.debug("Removed expired or replaced item <%s>", item.id)
def get_name(self, key):
"""Return the key containing the name."""
for key in ("name", "title"):
if self.has_key(key) and self.key_type(key) != self.NULL:
return self.get_as_string(key)
return ""
class NewsItem(cache.CachedInfo):
"""An item of news.
This class represents a single item of news on a channel. They're
created by members of the Channel class and accessible through it.
Properties:
id Channel-unique identifier for this item.
id_hash Relatively short, printable cryptographic hash of id
date Corrected UTC-Normalised update time, for sorting.
order Order in which items on the same date can be sorted.
hidden Item should be hidden (True if exists).
title One-line title (*).
link Link to the original format text (*).
summary Short first-page summary (*).
content Full HTML content.
modified Date the item claims to have been modified (*).
issued Date the item claims to have been issued (*).
created Date the item claims to have been created (*).
expired Date the item claims to expire (*).
author Name of the author (*).
publisher Name of the publisher (*).
category Category name (*).
comments Link to a page to enter comments (*).
license Link to the licence for the content (*).
source_name Name of the original source of this item (*).
source_link Link to the original source of this item (*).
Properties marked (*) will only be present if the original feed
contained them. Note that the various optional date fields are
simply claims made by the item and parsed from the information
given, 'date' is a far more reliable source of information.
Some feeds may define additional properties to those above.
"""
IGNORE_KEYS = ("categories", "contributors", "enclosures", "links",
"guidislink", "date", "tags")
def __init__(self, channel, id_):
cache.CachedInfo.__init__(self, channel._cache, id_)
self._channel = channel
self.id = id_
self.id_hash = md5.new(id_).hexdigest()
self.date = None
self.order = None
self.content = None
self.cache_read()
def update(self, entry):
"""Update the item from the feedparser entry given."""
for key in entry.keys():
if key in self.IGNORE_KEYS or key + "_parsed" in self.IGNORE_KEYS:
# Ignored fields
pass
elif entry.has_key(key + "_parsed"):
# Ignore unparsed date fields
pass
elif key.endswith("_detail"):
# retain name, email, and language sub-fields
if entry[key].has_key('name') and entry[key].name:
self.set_as_string(key.replace("_detail","_name"), \
entry[key].name)
if entry[key].has_key('email') and entry[key].email:
self.set_as_string(key.replace("_detail","_email"), \
entry[key].email)
if entry[key].has_key('language') and entry[key].language and \
(not self._channel.has_key('language') or \
entry[key].language != self._channel.language):
self.set_as_string(key.replace("_detail","_language"), \
entry[key].language)
elif key.endswith("_parsed"):
# Date fields
if entry[key] is not None:
self.set_as_date(key[:-len("_parsed")], entry[key])
elif key == "source":
# Source field: save both url and value
if entry[key].has_key("value"):
self.set_as_string(key + "_name", entry[key].value)
if entry[key].has_key("url"):
self.set_as_string(key + "_link", entry[key].url)
elif key == "content":
# Content field: concatenate the values
value = ""
for item in entry[key]:
if item.type == 'text/html':
item.value = sanitize.HTML(item.value)
elif item.type == 'text/plain':
item.value = xml.sax.saxutils.escape(item.value)
if item.has_key('language') and item.language and \
(not self._channel.has_key('language') or
item.language != self._channel.language) :
self.set_as_string(key + "_language", item.language)
value += cache.utf8(item.value)
self.set_as_string(key, value)
elif isinstance(entry[key], (str, unicode)):
# String fields
try:
detail = key + '_detail'
if entry.has_key(detail):
if entry[detail].has_key('type'):
if entry[detail].type == 'text/html':
entry[key] = sanitize.HTML(entry[key])
elif entry[detail].type == 'text/plain':
entry[key] = xml.sax.saxutils.escape(entry[key])
self.set_as_string(key, entry[key])
except KeyboardInterrupt:
raise
except:
log.exception("Ignored '%s' of <%s>, unknown format",
key, self.id)
# Generate the date field if we need to
self.get_date("date")
def get_date(self, key):
"""Get (or update) the date key.
        We check whether the date the entry claims to have been changed falls
        between the time we last updated this feed and the time we pulled the
        feed off the site.
        If it does then it's probably not bogus, and we'll sort accordingly.
        If it doesn't then we bound it appropriately; this ensures that
        entries appear in posting sequence but don't overlap entries
        added in previous updates and don't creep into the next one.
"""
for other_key in ("updated", "modified", "published", "issued", "created"):
if self.has_key(other_key):
date = self.get_as_date(other_key)
break
else:
date = None
if date is not None:
if date > self._channel.updated:
date = self._channel.updated
# elif date < self._channel.last_updated:
# date = self._channel.updated
elif self.has_key(key) and self.key_type(key) != self.NULL:
return self.get_as_date(key)
else:
date = self._channel.updated
self.set_as_date(key, date)
return date
def get_content(self, key):
"""Return the key containing the content."""
for key in ("content", "tagline", "summary"):
if self.has_key(key) and self.key_type(key) != self.NULL:
return self.get_as_string(key)
return ""

953
planet/__init__.py.orig Normal file

@ -0,0 +1,953 @@
#!/usr/bin/env python
# -*- coding: UTF-8 -*-
"""Planet aggregator library.
This package is a library for developing web sites or software that
aggregate RSS, CDF and Atom feeds taken from elsewhere into a single,
combined feed.
"""
__version__ = "2.0"
__authors__ = [ "Scott James Remnant <scott@netsplit.com>",
"Jeff Waugh <jdub@perkypants.org>" ]
__license__ = "Python"
# Modules available without separate import
import cache
import feedparser
import sanitize
import htmltmpl
import sgmllib
try:
import logging
except:
import compat_logging as logging
# Limit the effect of "from planet import *"
__all__ = ("cache", "feedparser", "htmltmpl", "logging",
"Planet", "Channel", "NewsItem")
import os
import md5
import time
import dbhash
import re
try:
from xml.sax.saxutils import escape
except:
def escape(data):
return data.replace("&","&amp;").replace(">","&gt;").replace("<","&lt;")
# Version information (for generator headers)
VERSION = ("Planet/%s +http://www.planetplanet.org" % __version__)
# Default User-Agent header to send when retrieving feeds
USER_AGENT = VERSION + " " + feedparser.USER_AGENT
# Default cache directory
CACHE_DIRECTORY = "cache"
# Default number of items to display from a new feed
NEW_FEED_ITEMS = 10
# Useful common date/time formats
TIMEFMT_ISO = "%Y-%m-%dT%H:%M:%S+00:00"
TIMEFMT_822 = "%a, %d %b %Y %H:%M:%S +0000"
# Log instance to use here
log = logging.getLogger("planet")
try:
log.warning
except:
log.warning = log.warn
# Defaults for the template file config sections
ENCODING = "utf-8"
ITEMS_PER_PAGE = 60
DAYS_PER_PAGE = 0
OUTPUT_DIR = "output"
DATE_FORMAT = "%B %d, %Y %I:%M %p"
NEW_DATE_FORMAT = "%B %d, %Y"
ACTIVITY_THRESHOLD = 0
class stripHtml(sgmllib.SGMLParser):
"remove all tags from the data"
def __init__(self, data):
sgmllib.SGMLParser.__init__(self)
self.result=''
self.feed(data)
self.close()
def handle_data(self, data):
if data: self.result+=data
def template_info(item, date_format):
"""Produce a dictionary of template information."""
info = {}
for key in item.keys():
if item.key_type(key) == item.DATE:
date = item.get_as_date(key)
info[key] = time.strftime(date_format, date)
info[key + "_iso"] = time.strftime(TIMEFMT_ISO, date)
info[key + "_822"] = time.strftime(TIMEFMT_822, date)
else:
info[key] = item[key]
if 'title' in item.keys():
info['title_plain'] = stripHtml(info['title']).result
return info
class Planet:
"""A set of channels.
This class represents a set of channels for which the items will
be aggregated together into one combined feed.
Properties:
user_agent User-Agent header to fetch feeds with.
cache_directory Directory to store cached channels in.
new_feed_items Number of items to display from a new feed.
filter A regular expression that articles must match.
exclude A regular expression that articles must not match.
"""
def __init__(self, config):
self.config = config
self._channels = []
self.user_agent = USER_AGENT
self.cache_directory = CACHE_DIRECTORY
self.new_feed_items = NEW_FEED_ITEMS
self.filter = None
self.exclude = None
def tmpl_config_get(self, template, option, default=None, raw=0, vars=None):
"""Get a template value from the configuration, with a default."""
if self.config.has_option(template, option):
return self.config.get(template, option, raw=raw, vars=None)
elif self.config.has_option("Planet", option):
return self.config.get("Planet", option, raw=raw, vars=None)
else:
return default
def gather_channel_info(self, template_file="Planet"):
date_format = self.tmpl_config_get(template_file,
"date_format", DATE_FORMAT, raw=1)
activity_threshold = int(self.tmpl_config_get(template_file,
"activity_threshold",
ACTIVITY_THRESHOLD))
if activity_threshold:
activity_horizon = \
time.gmtime(time.time()-86400*activity_threshold)
else:
activity_horizon = 0
channels = {}
channels_list = []
for channel in self.channels(hidden=1):
channels[channel] = template_info(channel, date_format)
channels_list.append(channels[channel])
# identify inactive feeds
if activity_horizon:
latest = channel.items(sorted=1)
if len(latest)==0 or latest[0].date < activity_horizon:
channels[channel]["message"] = \
"no activity in %d days" % activity_threshold
# report channel level errors
if not channel.url_status: continue
status = int(channel.url_status)
if status == 403:
channels[channel]["message"] = "403: forbidden"
elif status == 404:
channels[channel]["message"] = "404: not found"
elif status == 408:
channels[channel]["message"] = "408: request timeout"
elif status == 410:
channels[channel]["message"] = "410: gone"
elif status == 500:
channels[channel]["message"] = "internal server error"
elif status >= 400:
channels[channel]["message"] = "http status %s" % status
return channels, channels_list
def gather_items_info(self, channels, template_file="Planet", channel_list=None):
items_list = []
prev_date = []
prev_channel = None
date_format = self.tmpl_config_get(template_file,
"date_format", DATE_FORMAT, raw=1)
items_per_page = int(self.tmpl_config_get(template_file,
"items_per_page", ITEMS_PER_PAGE))
days_per_page = int(self.tmpl_config_get(template_file,
"days_per_page", DAYS_PER_PAGE))
new_date_format = self.tmpl_config_get(template_file,
"new_date_format", NEW_DATE_FORMAT, raw=1)
for newsitem in self.items(max_items=items_per_page,
max_days=days_per_page,
channels=channel_list):
item_info = template_info(newsitem, date_format)
chan_info = channels[newsitem._channel]
for k, v in chan_info.items():
item_info["channel_" + k] = v
# Check for the start of a new day
if prev_date[:3] != newsitem.date[:3]:
prev_date = newsitem.date
item_info["new_date"] = time.strftime(new_date_format,
newsitem.date)
# Check for the start of a new channel
if item_info.has_key("new_date") \
or prev_channel != newsitem._channel:
prev_channel = newsitem._channel
item_info["new_channel"] = newsitem._channel.url
items_list.append(item_info)
return items_list
def run(self, planet_name, planet_link, template_files, offline = False):
log = logging.getLogger("planet.runner")
# Create a planet
log.info("Loading cached data")
if self.config.has_option("Planet", "cache_directory"):
self.cache_directory = self.config.get("Planet", "cache_directory")
if self.config.has_option("Planet", "new_feed_items"):
self.new_feed_items = int(self.config.get("Planet", "new_feed_items"))
self.user_agent = "%s +%s %s" % (planet_name, planet_link,
self.user_agent)
if self.config.has_option("Planet", "filter"):
self.filter = self.config.get("Planet", "filter")
# The other configuration blocks are channels to subscribe to
for feed_url in self.config.sections():
if feed_url == "Planet" or feed_url in template_files:
continue
# Create a channel, configure it and subscribe it
channel = Channel(self, feed_url)
self.subscribe(channel)
# Update it
try:
if not offline and not channel.url_status == '410':
channel.update()
except KeyboardInterrupt:
raise
except:
log.exception("Update of <%s> failed", feed_url)
def generate_all_files(self, template_files, planet_name,
planet_link, planet_feed, owner_name, owner_email):
log = logging.getLogger("planet.runner")
# Go-go-gadget-template
for template_file in template_files:
manager = htmltmpl.TemplateManager()
log.info("Processing template %s", template_file)
try:
template = manager.prepare(template_file)
except htmltmpl.TemplateError:
template = manager.prepare(os.path.basename(template_file))
# Read the configuration
output_dir = self.tmpl_config_get(template_file,
"output_dir", OUTPUT_DIR)
date_format = self.tmpl_config_get(template_file,
"date_format", DATE_FORMAT, raw=1)
encoding = self.tmpl_config_get(template_file, "encoding", ENCODING)
# We treat each template individually
base = os.path.splitext(os.path.basename(template_file))[0]
url = os.path.join(planet_link, base)
output_file = os.path.join(output_dir, base)
# Gather information
channels, channels_list = self.gather_channel_info(template_file)
items_list = self.gather_items_info(channels, template_file)
# Process the template
tp = htmltmpl.TemplateProcessor(html_escape=0)
tp.set("Items", items_list)
tp.set("Channels", channels_list)
# Generic information
tp.set("generator", VERSION)
tp.set("name", planet_name)
tp.set("link", planet_link)
tp.set("owner_name", owner_name)
tp.set("owner_email", owner_email)
tp.set("url", url)
if planet_feed:
tp.set("feed", planet_feed)
tp.set("feedtype", planet_feed.find('rss')>=0 and 'rss' or 'atom')
# Update time
date = time.gmtime()
tp.set("date", time.strftime(date_format, date))
tp.set("date_iso", time.strftime(TIMEFMT_ISO, date))
tp.set("date_822", time.strftime(TIMEFMT_822, date))
try:
log.info("Writing %s", output_file)
output_fd = open(output_file, "w")
if encoding.lower() in ("utf-8", "utf8"):
# UTF-8 output is the default because we use that internally
output_fd.write(tp.process(template))
elif encoding.lower() in ("xml", "html", "sgml"):
# Magic for Python 2.3 users
output = tp.process(template).decode("utf-8")
output_fd.write(output.encode("ascii", "xmlcharrefreplace"))
else:
# Must be a "known" encoding
output = tp.process(template).decode("utf-8")
output_fd.write(output.encode(encoding, "replace"))
output_fd.close()
except KeyboardInterrupt:
raise
except:
log.exception("Write of %s failed", output_file)
def channels(self, hidden=0, sorted=1):
"""Return the list of channels."""
channels = []
for channel in self._channels:
if hidden or not channel.has_key("hidden"):
channels.append((channel.name, channel))
if sorted:
channels.sort()
return [ c[-1] for c in channels ]
def find_by_basename(self, basename):
for channel in self._channels:
if basename == channel.cache_basename(): return channel
def subscribe(self, channel):
"""Subscribe the planet to the channel."""
self._channels.append(channel)
def unsubscribe(self, channel):
"""Unsubscribe the planet from the channel."""
self._channels.remove(channel)
def items(self, hidden=0, sorted=1, max_items=0, max_days=0, channels=None):
"""Return an optionally filtered list of items in the channel.
The filters are applied in the following order:
If hidden is true then items in hidden channels and hidden items
will be returned.
If sorted is true then the item list will be sorted with the newest
first.
If max_items is non-zero then this number of items, at most, will
be returned.
If max_days is non-zero then any items older than the newest by
this number of days won't be returned. Requires sorted=1 to work.
        The sharp-eyed will note that this looks a little strange code-wise;
        it turns out that Python gets *really* slow if we try to sort the
        actual items themselves. Also we use mktime here, but that's ok
        because we discard the numbers and just need them to be relatively
        consistent with each other.
"""
planet_filter_re = None
if self.filter:
planet_filter_re = re.compile(self.filter, re.I)
planet_exclude_re = None
if self.exclude:
planet_exclude_re = re.compile(self.exclude, re.I)
items = []
seen_guids = {}
if not channels: channels=self.channels(hidden=hidden, sorted=0)
for channel in channels:
for item in channel._items.values():
if hidden or not item.has_key("hidden"):
channel_filter_re = None
if channel.filter:
channel_filter_re = re.compile(channel.filter,
re.I)
channel_exclude_re = None
if channel.exclude:
channel_exclude_re = re.compile(channel.exclude,
re.I)
if (planet_filter_re or planet_exclude_re \
or channel_filter_re or channel_exclude_re):
title = ""
if item.has_key("title"):
title = item.title
content = item.get_content("content")
if planet_filter_re:
if not (planet_filter_re.search(title) \
or planet_filter_re.search(content)):
continue
if planet_exclude_re:
if (planet_exclude_re.search(title) \
or planet_exclude_re.search(content)):
continue
if channel_filter_re:
if not (channel_filter_re.search(title) \
or channel_filter_re.search(content)):
continue
if channel_exclude_re:
if (channel_exclude_re.search(title) \
or channel_exclude_re.search(content)):
continue
if not seen_guids.has_key(item.id):
                        seen_guids[item.id] = 1
items.append((time.mktime(item.date), item.order, item))
# Sort the list
if sorted:
items.sort()
items.reverse()
# Apply max_items filter
if len(items) and max_items:
items = items[:max_items]
# Apply max_days filter
if len(items) and max_days:
max_count = 0
            max_time = items[0][0] - max_days * 86400  # 86400 seconds per day
for item in items:
if item[0] > max_time:
max_count += 1
else:
items = items[:max_count]
break
return [ i[-1] for i in items ]
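# Typical driver sequence (a sketch; the planet.py script wires this
# up from config.ini in practice):
#
#   config = ConfigParser.ConfigParser()
#   config.read('config.ini')
#   my_planet = Planet(config)
#   my_planet.run(planet_name, planet_link, template_files)
#   my_planet.generate_all_files(template_files, planet_name,
#       planet_link, planet_feed, owner_name, owner_email)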
class Channel(cache.CachedInfo):
"""A list of news items.
This class represents a list of news items taken from the feed of
a website or other source.
Properties:
url URL of the feed.
url_etag E-Tag of the feed URL.
url_modified Last modified time of the feed URL.
url_status Last HTTP status of the feed URL.
hidden Channel should be hidden (True if exists).
name Name of the feed owner, or feed title.
next_order Next order number to be assigned to NewsItem
updated Correct UTC-Normalised update time of the feed.
last_updated Correct UTC-Normalised time the feed was last updated.
id An identifier the feed claims is unique (*).
title One-line title (*).
link Link to the original format feed (*).
tagline Short description of the feed (*).
info Longer description of the feed (*).
modified Date the feed claims to have been modified (*).
author Name of the author (*).
publisher Name of the publisher (*).
generator Name of the feed generator (*).
category Category name (*).
copyright Copyright information for humans to read (*).
license Link to the licence for the content (*).
docs Link to the specification of the feed format (*).
language Primary language (*).
errorreportsto E-Mail address to send error reports to (*).
image_url URL of an associated image (*).
image_link Link to go with the associated image (*).
image_title Alternative text of the associated image (*).
image_width Width of the associated image (*).
image_height Height of the associated image (*).
filter A regular expression that articles must match.
exclude A regular expression that articles must not match.
Properties marked (*) will only be present if the original feed
contained them. Note that the optional 'modified' date field is simply
    a claim made by the feed and parsed from the information given; 'updated'
(and 'last_updated') are far more reliable sources of information.
Some feeds may define additional properties to those above.
"""
IGNORE_KEYS = ("links", "contributors", "textinput", "cloud", "categories",
"url", "href", "url_etag", "url_modified", "tags", "itunes_explicit")
def __init__(self, planet, url):
if not os.path.isdir(planet.cache_directory):
os.makedirs(planet.cache_directory)
cache_filename = cache.filename(planet.cache_directory, url)
cache_file = dbhash.open(cache_filename, "c", 0666)
cache.CachedInfo.__init__(self, cache_file, url, root=1)
self._items = {}
self._planet = planet
self._expired = []
self.url = url
# retain the original URL for error reporting
self.configured_url = url
self.url_etag = None
self.url_status = None
self.url_modified = None
self.name = None
self.updated = None
self.last_updated = None
self.filter = None
self.exclude = None
self.next_order = "0"
self.cache_read()
self.cache_read_entries()
if planet.config.has_section(url):
for option in planet.config.options(url):
value = planet.config.get(url, option)
self.set_as_string(option, value, cached=0)
def has_item(self, id_):
"""Check whether the item exists in the channel."""
return self._items.has_key(id_)
def get_item(self, id_):
"""Return the item from the channel."""
return self._items[id_]
# Special methods
__contains__ = has_item
def items(self, hidden=0, sorted=0):
"""Return the item list."""
items = []
for item in self._items.values():
if hidden or not item.has_key("hidden"):
items.append((time.mktime(item.date), item.order, item))
if sorted:
items.sort()
items.reverse()
return [ i[-1] for i in items ]
def __iter__(self):
"""Iterate the sorted item list."""
return iter(self.items(sorted=1))
def cache_read_entries(self):
"""Read entry information from the cache."""
keys = self._cache.keys()
for key in keys:
if key.find(" ") != -1: continue
if self.has_key(key): continue
item = NewsItem(self, key)
self._items[key] = item
def cache_basename(self):
return cache.filename('',self._id)
def cache_write(self, sync=1):
"""Write channel and item information to the cache."""
for item in self._items.values():
item.cache_write(sync=0)
for item in self._expired:
item.cache_clear(sync=0)
cache.CachedInfo.cache_write(self, sync)
self._expired = []
def feed_information(self):
"""
Returns a description string for the feed embedded in this channel.
This will usually simply be the feed url embedded in <>, but in the
case where the current self.url has changed from the original
self.configured_url the string will contain both pieces of information.
This is so that the URL in question is easier to find in logging
output: getting an error about a URL that doesn't appear in your config
file is annoying.
"""
if self.url == self.configured_url:
return "<%s>" % self.url
else:
return "<%s> (formerly <%s>)" % (self.url, self.configured_url)
def update(self):
"""Download the feed to refresh the information.
This does the actual work of pulling down the feed and if it changes
updates the cached information about the feed and entries within it.
"""
info = feedparser.parse(self.url,
etag=self.url_etag, modified=self.url_modified,
agent=self._planet.user_agent)
if info.has_key("status"):
self.url_status = str(info.status)
elif info.has_key("entries") and len(info.entries)>0:
self.url_status = str(200)
elif info.bozo and info.bozo_exception.__class__.__name__=='Timeout':
self.url_status = str(408)
else:
self.url_status = str(500)
if self.url_status == '301' and \
(info.has_key("entries") and len(info.entries)>0):
log.warning("Feed has moved from <%s> to <%s>", self.url, info.url)
try:
os.link(cache.filename(self._planet.cache_directory, self.url),
cache.filename(self._planet.cache_directory, info.url))
except:
pass
self.url = info.url
elif self.url_status == '304':
log.info("Feed %s unchanged", self.feed_information())
return
elif self.url_status == '410':
log.info("Feed %s gone", self.feed_information())
self.cache_write()
return
elif self.url_status == '408':
log.warning("Feed %s timed out", self.feed_information())
return
elif int(self.url_status) >= 400:
log.error("Error %s while updating feed %s",
self.url_status, self.feed_information())
return
else:
log.info("Updating feed %s", self.feed_information())
self.url_etag = info.has_key("etag") and info.etag or None
self.url_modified = info.has_key("modified") and info.modified or None
if self.url_etag is not None:
log.debug("E-Tag: %s", self.url_etag)
if self.url_modified is not None:
log.debug("Last Modified: %s",
time.strftime(TIMEFMT_ISO, self.url_modified))
self.update_info(info.feed)
self.update_entries(info.entries)
self.cache_write()
def update_info(self, feed):
"""Update information from the feed.
This reads the feed information supplied by feedparser and updates
the cached information about the feed. These are the various
potentially interesting properties that you might care about.
"""
for key in feed.keys():
if key in self.IGNORE_KEYS or key + "_parsed" in self.IGNORE_KEYS:
# Ignored fields
pass
elif feed.has_key(key + "_parsed"):
# Ignore unparsed date fields
pass
elif key.endswith("_detail"):
# retain name and email sub-fields
if feed[key].has_key('name') and feed[key].name:
self.set_as_string(key.replace("_detail","_name"), \
feed[key].name)
if feed[key].has_key('email') and feed[key].email:
self.set_as_string(key.replace("_detail","_email"), \
feed[key].email)
elif key == "items":
# Ignore items field
pass
elif key.endswith("_parsed"):
# Date fields
if feed[key] is not None:
self.set_as_date(key[:-len("_parsed")], feed[key])
elif key == "image":
# Image field: save all the information
if feed[key].has_key("url"):
self.set_as_string(key + "_url", feed[key].url)
if feed[key].has_key("link"):
self.set_as_string(key + "_link", feed[key].link)
if feed[key].has_key("title"):
self.set_as_string(key + "_title", feed[key].title)
if feed[key].has_key("width"):
self.set_as_string(key + "_width", str(feed[key].width))
if feed[key].has_key("height"):
self.set_as_string(key + "_height", str(feed[key].height))
elif isinstance(feed[key], (str, unicode)):
# String fields
try:
detail = key + '_detail'
if feed.has_key(detail) and feed[detail].has_key('type'):
if feed[detail].type == 'text/html':
feed[key] = sanitize.HTML(feed[key])
elif feed[detail].type == 'text/plain':
feed[key] = escape(feed[key])
self.set_as_string(key, feed[key])
except KeyboardInterrupt:
raise
except:
log.exception("Ignored '%s' of <%s>, unknown format",
key, self.url)
def update_entries(self, entries):
"""Update entries from the feed.
This reads the entries supplied by feedparser and updates the
cached information about them. It's at this point we update
the 'updated' timestamp and keep the old one in 'last_updated',
these provide boundaries for acceptable entry times.
If this is the first time a feed has been updated then most of the
items will be marked as hidden, according to Planet.new_feed_items.
If the feed does not contain items which, according to the sort order,
        should be there, those items are assumed to have been expired from
the feed or replaced and are removed from the cache.
"""
if not len(entries):
return
self.last_updated = self.updated
self.updated = time.gmtime()
new_items = []
feed_items = []
for entry in entries:
# Try really hard to find some kind of unique identifier
if entry.has_key("id"):
entry_id = cache.utf8(entry.id)
elif entry.has_key("link"):
entry_id = cache.utf8(entry.link)
elif entry.has_key("title"):
entry_id = (self.url + "/"
+ md5.new(cache.utf8(entry.title)).hexdigest())
elif entry.has_key("summary"):
entry_id = (self.url + "/"
+ md5.new(cache.utf8(entry.summary)).hexdigest())
else:
log.error("Unable to find or generate id, entry ignored")
continue
# Create the item if necessary and update
if self.has_item(entry_id):
item = self._items[entry_id]
else:
item = NewsItem(self, entry_id)
self._items[entry_id] = item
new_items.append(item)
item.update(entry)
feed_items.append(entry_id)
# Hide excess items the first time through
if self.last_updated is None and self._planet.new_feed_items \
and len(feed_items) > self._planet.new_feed_items:
item.hidden = "yes"
log.debug("Marked <%s> as hidden (new feed)", entry_id)
# Assign order numbers in reverse
new_items.reverse()
for item in new_items:
item.order = self.next_order = str(int(self.next_order) + 1)
# Check for expired or replaced items
feed_count = len(feed_items)
log.debug("Items in Feed: %d", feed_count)
for item in self.items(sorted=1):
if feed_count < 1:
break
elif item.id in feed_items:
feed_count -= 1
elif item._channel.url_status != '226':
del(self._items[item.id])
self._expired.append(item)
log.debug("Removed expired or replaced item <%s>", item.id)
def get_name(self, key):
"""Return the key containing the name."""
for key in ("name", "title"):
if self.has_key(key) and self.key_type(key) != self.NULL:
return self.get_as_string(key)
return ""
class NewsItem(cache.CachedInfo):
"""An item of news.
This class represents a single item of news on a channel. They're
created by members of the Channel class and accessible through it.
Properties:
id Channel-unique identifier for this item.
id_hash Relatively short, printable cryptographic hash of id
date Corrected UTC-Normalised update time, for sorting.
order Order in which items on the same date can be sorted.
hidden Item should be hidden (True if exists).
title One-line title (*).
link Link to the original format text (*).
summary Short first-page summary (*).
content Full HTML content.
modified Date the item claims to have been modified (*).
issued Date the item claims to have been issued (*).
created Date the item claims to have been created (*).
expired Date the item claims to expire (*).
author Name of the author (*).
publisher Name of the publisher (*).
category Category name (*).
comments Link to a page to enter comments (*).
license Link to the licence for the content (*).
source_name Name of the original source of this item (*).
source_link Link to the original source of this item (*).
Properties marked (*) will only be present if the original feed
contained them. Note that the various optional date fields are
simply claims made by the item and parsed from the information
    given; 'date' is a far more reliable source of information.
Some feeds may define additional properties to those above.
"""
IGNORE_KEYS = ("categories", "contributors", "enclosures", "links",
"guidislink", "date", "tags")
def __init__(self, channel, id_):
cache.CachedInfo.__init__(self, channel._cache, id_)
self._channel = channel
self.id = id_
self.id_hash = md5.new(id_).hexdigest()
self.date = None
self.order = None
self.content = None
self.cache_read()
def update(self, entry):
"""Update the item from the feedparser entry given."""
for key in entry.keys():
if key in self.IGNORE_KEYS or key + "_parsed" in self.IGNORE_KEYS:
# Ignored fields
pass
elif entry.has_key(key + "_parsed"):
# Ignore unparsed date fields
pass
elif key.endswith("_detail"):
# retain name, email, and language sub-fields
if entry[key].has_key('name') and entry[key].name:
self.set_as_string(key.replace("_detail","_name"), \
entry[key].name)
if entry[key].has_key('email') and entry[key].email:
self.set_as_string(key.replace("_detail","_email"), \
entry[key].email)
if entry[key].has_key('language') and entry[key].language and \
(not self._channel.has_key('language') or \
entry[key].language != self._channel.language):
self.set_as_string(key.replace("_detail","_language"), \
entry[key].language)
elif key.endswith("_parsed"):
# Date fields
if entry[key] is not None:
self.set_as_date(key[:-len("_parsed")], entry[key])
elif key == "source":
# Source field: save both url and value
if entry[key].has_key("value"):
self.set_as_string(key + "_name", entry[key].value)
if entry[key].has_key("url"):
self.set_as_string(key + "_link", entry[key].url)
elif key == "content":
# Content field: concatenate the values
value = ""
for item in entry[key]:
if item.type == 'text/html':
item.value = sanitize.HTML(item.value)
elif item.type == 'text/plain':
item.value = escape(item.value)
if item.has_key('language') and item.language and \
(not self._channel.has_key('language') or
item.language != self._channel.language) :
self.set_as_string(key + "_language", item.language)
value += cache.utf8(item.value)
self.set_as_string(key, value)
elif isinstance(entry[key], (str, unicode)):
# String fields
try:
detail = key + '_detail'
if entry.has_key(detail):
if entry[detail].has_key('type'):
if entry[detail].type == 'text/html':
entry[key] = sanitize.HTML(entry[key])
elif entry[detail].type == 'text/plain':
entry[key] = escape(entry[key])
self.set_as_string(key, entry[key])
except KeyboardInterrupt:
raise
except:
log.exception("Ignored '%s' of <%s>, unknown format",
key, self.id)
# Generate the date field if we need to
self.get_date("date")
def get_date(self, key):
"""Get (or update) the date key.
        We check whether the date the entry claims to have been changed
        falls between the time we last updated this feed and the time we
        pulled the feed off the site.
        If it does then it's probably not bogus, and we'll sort accordingly.
        If it doesn't then we bound it appropriately; this ensures that
        entries appear in posting sequence but don't overlap entries
        added in previous updates and don't creep into the next one.
"""
for other_key in ("updated", "modified", "published", "issued", "created"):
if self.has_key(other_key):
date = self.get_as_date(other_key)
break
else:
date = None
if date is not None:
if date > self._channel.updated:
date = self._channel.updated
# elif date < self._channel.last_updated:
# date = self._channel.updated
elif self.has_key(key) and self.key_type(key) != self.NULL:
return self.get_as_date(key)
else:
date = self._channel.updated
self.set_as_date(key, date)
return date
def get_content(self, key):
"""Return the key containing the content."""
for key in ("content", "tagline", "summary"):
if self.has_key(key) and self.key_type(key) != self.NULL:
return self.get_as_string(key)
return ""

124
planet/atomstyler.py Normal file
View File

@ -0,0 +1,124 @@
from xml.dom import minidom, Node
from urlparse import urlparse, urlunparse
from xml.parsers.expat import ExpatError
from htmlentitydefs import name2codepoint
import re
# select and apply an xml:base for this entry
class relativize:
def __init__(self, parent):
self.score = {}
self.links = []
self.collect_and_tally(parent)
self.base = self.select_optimal_base()
if self.base:
if not parent.hasAttribute('xml:base'):
self.rebase(parent)
parent.setAttribute('xml:base', self.base)
# collect and tally cite, href and src attributes
def collect_and_tally(self,parent):
uri = None
if parent.hasAttribute('cite'): uri=parent.getAttribute('cite')
if parent.hasAttribute('href'): uri=parent.getAttribute('href')
if parent.hasAttribute('src'): uri=parent.getAttribute('src')
if uri:
parts=urlparse(uri)
if parts[0].lower() == 'http':
parts = (parts[1]+parts[2]).split('/')
base = None
for i in range(1,len(parts)):
base = tuple(parts[0:i])
self.score[base] = self.score.get(base,0) + len(base)
if base and base not in self.links: self.links.append(base)
for node in parent.childNodes:
if node.nodeType == Node.ELEMENT_NODE:
self.collect_and_tally(node)
# select the xml:base with the highest score
def select_optimal_base(self):
if not self.score: return None
for link in self.links:
self.score[link] = 0
winner = max(self.score.values())
if not winner: return None
for key in self.score.keys():
if self.score[key] == winner:
if winner == len(key): return None
return urlunparse(('http', key[0], '/'.join(key[1:]), '', '', '')) + '/'
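  # Sketch: for links http://example.com/blog/2008/a and
  # http://example.com/blog/2008/b, each link's own directory tuple
  # ('example.com', 'blog', '2008') is zeroed above, so the shared
  # prefix ('example.com', 'blog') wins with score 4 and the entry is
  # rebased onto 'http://example.com/blog/'.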
# rewrite cite, href and src attributes using this base
def rebase(self,parent):
uri = None
if parent.hasAttribute('cite'): uri=parent.getAttribute('cite')
if parent.hasAttribute('href'): uri=parent.getAttribute('href')
if parent.hasAttribute('src'): uri=parent.getAttribute('src')
if uri and uri.startswith(self.base):
uri = uri[len(self.base):] or '.'
      if parent.hasAttribute('href'): parent.setAttribute('href', uri)
      if parent.hasAttribute('src'): parent.setAttribute('src', uri)
for node in parent.childNodes:
if node.nodeType == Node.ELEMENT_NODE:
self.rebase(node)
# convert type="html" to type="plain" or type="xhtml" as appropriate
def retype(parent):
for node in parent.childNodes:
if node.nodeType == Node.ELEMENT_NODE:
if node.hasAttribute('type') and node.getAttribute('type') == 'html':
if len(node.childNodes)==0:
node.removeAttribute('type')
elif len(node.childNodes)==1:
# replace html entity defs with utf-8
          chunks = re.split(r'&(\w+);', node.childNodes[0].nodeValue)
for i in range(1,len(chunks),2):
if chunks[i] in ['amp', 'lt', 'gt', 'apos', 'quot']:
chunks[i] ='&' + chunks[i] +';'
elif chunks[i] in name2codepoint:
chunks[i]=unichr(name2codepoint[chunks[i]])
else:
chunks[i]='&' + chunks[i] + ';'
text = u"".join(chunks)
try:
# see if the resulting text is a well-formed XML fragment
div = '<div xmlns="http://www.w3.org/1999/xhtml">%s</div>'
data = minidom.parseString((div % text.encode('utf-8')))
if text.find('<') < 0:
# plain text
node.removeAttribute('type')
text = data.documentElement.childNodes[0].nodeValue
node.childNodes[0].replaceWholeText(text)
elif len(text) > 80:
# xhtml
node.setAttribute('type', 'xhtml')
node.removeChild(node.childNodes[0])
node.appendChild(data.documentElement)
except ExpatError:
# leave as html
pass
else:
# recurse
retype(node)
if parent.nodeName == 'entry':
relativize(parent)
if __name__ == '__main__':
  # run styler on each file mentioned on the command line
import sys
for feed in sys.argv[1:]:
doc = minidom.parse(feed)
doc.normalize()
retype(doc.documentElement)
open(feed,'w').write(doc.toxml('utf-8'))

306
planet/cache.py Normal file
View File

@ -0,0 +1,306 @@
#!/usr/bin/env python
# -*- coding: UTF-8 -*-
"""Item cache.
Between runs of Planet we need somewhere to store the feed information
we parsed; this is so we don't lose information when a particular feed
goes away or is too short to hold enough items.
This module provides the code to handle this cache transparently enough
that the rest of the code can take the persistence for granted.
"""
import os
import re
# Regular expressions to sanitise cache filenames
re_url_scheme = re.compile(r'^[^:]*://')
re_slash = re.compile(r'[?/]+')
re_initial_cruft = re.compile(r'^[,.]*')
re_final_cruft = re.compile(r'[,.]*$')
class CachedInfo:
"""Cached information.
This class is designed to hold information that is stored in a cache
between instances. It can act both as a dictionary (c['foo']) and
as an object (c.foo) to get and set values and supports both string
and date values.
If you wish to support special fields you can derive a class off this
and implement get_FIELD and set_FIELD functions which will be
automatically called.
"""
STRING = "string"
DATE = "date"
NULL = "null"
def __init__(self, cache, id_, root=0):
self._type = {}
self._value = {}
self._cached = {}
self._cache = cache
self._id = id_.replace(" ", "%20")
self._root = root
def cache_key(self, key):
"""Return the cache key name for the given key."""
key = key.replace(" ", "_")
if self._root:
return key
else:
return self._id + " " + key
def cache_read(self):
"""Read information from the cache."""
if self._root:
keys_key = " keys"
else:
keys_key = self._id
if self._cache.has_key(keys_key):
keys = self._cache[keys_key].split(" ")
else:
return
for key in keys:
cache_key = self.cache_key(key)
if not self._cached.has_key(key) or self._cached[key]:
# Key either hasn't been loaded, or is one for the cache
self._value[key] = self._cache[cache_key]
self._type[key] = self._cache[cache_key + " type"]
self._cached[key] = 1
def cache_write(self, sync=1):
"""Write information to the cache."""
self.cache_clear(sync=0)
keys = []
for key in self.keys():
cache_key = self.cache_key(key)
if not self._cached[key]:
if self._cache.has_key(cache_key):
# Non-cached keys need to be cleared
del(self._cache[cache_key])
del(self._cache[cache_key + " type"])
continue
keys.append(key)
self._cache[cache_key] = self._value[key]
self._cache[cache_key + " type"] = self._type[key]
if self._root:
keys_key = " keys"
else:
keys_key = self._id
self._cache[keys_key] = " ".join(keys)
if sync:
self._cache.sync()
def cache_clear(self, sync=1):
"""Remove information from the cache."""
if self._root:
keys_key = " keys"
else:
keys_key = self._id
if self._cache.has_key(keys_key):
keys = self._cache[keys_key].split(" ")
del(self._cache[keys_key])
else:
return
for key in keys:
cache_key = self.cache_key(key)
del(self._cache[cache_key])
del(self._cache[cache_key + " type"])
if sync:
self._cache.sync()
def has_key(self, key):
"""Check whether the key exists."""
key = key.replace(" ", "_")
return self._value.has_key(key)
def key_type(self, key):
"""Return the key type."""
key = key.replace(" ", "_")
return self._type[key]
def set(self, key, value, cached=1):
"""Set the value of the given key.
        If a set_KEY function exists, that is called; otherwise the
        string function is tried, and the date function if that fails
        (it nearly always will).
"""
key = key.replace(" ", "_")
try:
func = getattr(self, "set_" + key)
except AttributeError:
pass
else:
return func(key, value)
if value == None:
return self.set_as_null(key, value)
else:
try:
return self.set_as_string(key, value)
except TypeError:
return self.set_as_date(key, value)
def get(self, key):
"""Return the value of the given key.
        If a get_KEY function exists, that is called; otherwise the
        function matching the key's stored type is called, if it exists.
"""
key = key.replace(" ", "_")
try:
func = getattr(self, "get_" + key)
except AttributeError:
pass
else:
return func(key)
try:
func = getattr(self, "get_as_" + self._type[key])
except AttributeError:
pass
else:
return func(key)
return self._value[key]
def set_as_string(self, key, value, cached=1):
"""Set the key to the string value.
        The value is converted to UTF-8 if it is a Unicode string; otherwise
        it's assumed to have failed decoding (feedparser tries pretty hard),
        so it is re-encoded to UTF-8, reading it as ISO-8859-1 if necessary.
"""
value = utf8(value)
key = key.replace(" ", "_")
self._value[key] = value
self._type[key] = self.STRING
self._cached[key] = cached
def get_as_string(self, key):
"""Return the key as a string value."""
key = key.replace(" ", "_")
if not self.has_key(key):
raise KeyError, key
return self._value[key]
def set_as_date(self, key, value, cached=1):
"""Set the key to the date value.
The date should be a 9-item tuple as returned by time.gmtime().
"""
value = " ".join([ str(s) for s in value ])
key = key.replace(" ", "_")
self._value[key] = value
self._type[key] = self.DATE
self._cached[key] = cached
def get_as_date(self, key):
"""Return the key as a date value."""
key = key.replace(" ", "_")
if not self.has_key(key):
raise KeyError, key
value = self._value[key]
return tuple([ int(i) for i in value.split(" ") ])
def set_as_null(self, key, value, cached=1):
"""Set the key to the null value.
This only exists to make things less magic.
"""
key = key.replace(" ", "_")
self._value[key] = ""
self._type[key] = self.NULL
self._cached[key] = cached
def get_as_null(self, key):
"""Return the key as the null value."""
key = key.replace(" ", "_")
if not self.has_key(key):
raise KeyError, key
return None
def del_key(self, key):
"""Delete the given key."""
key = key.replace(" ", "_")
if not self.has_key(key):
raise KeyError, key
del(self._value[key])
del(self._type[key])
del(self._cached[key])
def keys(self):
"""Return the list of cached keys."""
return self._value.keys()
def __iter__(self):
"""Iterate the cached keys."""
return iter(self._value.keys())
# Special methods
__contains__ = has_key
__setitem__ = set_as_string
__getitem__ = get
__delitem__ = del_key
__delattr__ = del_key
def __setattr__(self, key, value):
if key.startswith("_"):
self.__dict__[key] = value
else:
self.set(key, value)
def __getattr__(self, key):
if self.has_key(key):
return self.get(key)
else:
raise AttributeError, key
def filename(directory, filename):
"""Return a filename suitable for the cache.
Strips dangerous and common characters to create a filename we
can use to store the cache in.
"""
filename = re_url_scheme.sub("", filename)
filename = re_slash.sub(",", filename)
filename = re_initial_cruft.sub("", filename)
filename = re_final_cruft.sub("", filename)
return os.path.join(directory, filename)
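# For example, filename('cache', 'http://example.com/feed.xml?format=rss')
# strips the scheme and collapses '/' and '?' runs to commas, giving
# 'cache/example.com,feed.xml,format=rss'.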
def utf8(value):
"""Return the value as a UTF-8 string."""
if type(value) == type(u''):
return value.encode("utf-8")
else:
try:
return unicode(value, "utf-8").encode("utf-8")
except UnicodeError:
try:
return unicode(value, "iso-8859-1").encode("utf-8")
except UnicodeError:
return unicode(value, "ascii", "replace").encode("utf-8")

File diff suppressed because it is too large

View File

@ -0,0 +1,299 @@
# Copyright 2001-2002 by Vinay Sajip. All Rights Reserved.
#
# Permission to use, copy, modify, and distribute this software and its
# documentation for any purpose and without fee is hereby granted,
# provided that the above copyright notice appear in all copies and that
# both that copyright notice and this permission notice appear in
# supporting documentation, and that the name of Vinay Sajip
# not be used in advertising or publicity pertaining to distribution
# of the software without specific, written prior permission.
# VINAY SAJIP DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE, INCLUDING
# ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL
# VINAY SAJIP BE LIABLE FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR
# ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER
# IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT
# OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
"""
Logging package for Python. Based on PEP 282 and comments thereto in
comp.lang.python, and influenced by Apache's log4j system.
Should work under Python versions >= 1.5.2, except that source line
information is not available unless 'inspect' is.
Copyright (C) 2001-2002 Vinay Sajip. All Rights Reserved.
To use, simply 'import logging' and log away!
"""
import sys, logging, logging.handlers, string, thread, threading, socket, struct, os, types
from SocketServer import ThreadingTCPServer, StreamRequestHandler
DEFAULT_LOGGING_CONFIG_PORT = 9030
if sys.platform == "win32":
RESET_ERROR = 10054 #WSAECONNRESET
else:
RESET_ERROR = 104 #ECONNRESET
#
# The following code implements a socket listener for on-the-fly
# reconfiguration of logging.
#
# _listener holds the server object doing the listening
_listener = None
def fileConfig(fname, defaults=None):
"""
Read the logging configuration from a ConfigParser-format file.
This can be called several times from an application, allowing an end user
the ability to select from various pre-canned configurations (if the
developer provides a mechanism to present the choices and load the chosen
configuration).
In versions of ConfigParser which have the readfp method [typically
shipped in 2.x versions of Python], you can pass in a file-like object
rather than a filename, in which case the file-like object will be read
using readfp.
"""
import ConfigParser
cp = ConfigParser.ConfigParser(defaults)
if hasattr(cp, 'readfp') and hasattr(fname, 'readline'):
cp.readfp(fname)
else:
cp.read(fname)
#first, do the formatters...
flist = cp.get("formatters", "keys")
if len(flist):
flist = string.split(flist, ",")
formatters = {}
for form in flist:
sectname = "formatter_%s" % form
opts = cp.options(sectname)
if "format" in opts:
fs = cp.get(sectname, "format", 1)
else:
fs = None
if "datefmt" in opts:
dfs = cp.get(sectname, "datefmt", 1)
else:
dfs = None
f = logging.Formatter(fs, dfs)
formatters[form] = f
#next, do the handlers...
#critical section...
logging._acquireLock()
try:
try:
#first, lose the existing handlers...
logging._handlers.clear()
#now set up the new ones...
hlist = cp.get("handlers", "keys")
if len(hlist):
hlist = string.split(hlist, ",")
handlers = {}
fixups = [] #for inter-handler references
for hand in hlist:
sectname = "handler_%s" % hand
klass = cp.get(sectname, "class")
opts = cp.options(sectname)
if "formatter" in opts:
fmt = cp.get(sectname, "formatter")
else:
fmt = ""
klass = eval(klass, vars(logging))
args = cp.get(sectname, "args")
args = eval(args, vars(logging))
h = apply(klass, args)
if "level" in opts:
level = cp.get(sectname, "level")
h.setLevel(logging._levelNames[level])
if len(fmt):
h.setFormatter(formatters[fmt])
#temporary hack for FileHandler and MemoryHandler.
if klass == logging.handlers.MemoryHandler:
if "target" in opts:
target = cp.get(sectname,"target")
else:
target = ""
if len(target): #the target handler may not be loaded yet, so keep for later...
fixups.append((h, target))
handlers[hand] = h
#now all handlers are loaded, fixup inter-handler references...
for fixup in fixups:
h = fixup[0]
t = fixup[1]
h.setTarget(handlers[t])
#at last, the loggers...first the root...
llist = cp.get("loggers", "keys")
llist = string.split(llist, ",")
llist.remove("root")
sectname = "logger_root"
root = logging.root
log = root
opts = cp.options(sectname)
if "level" in opts:
level = cp.get(sectname, "level")
log.setLevel(logging._levelNames[level])
for h in root.handlers[:]:
root.removeHandler(h)
hlist = cp.get(sectname, "handlers")
if len(hlist):
hlist = string.split(hlist, ",")
for hand in hlist:
log.addHandler(handlers[hand])
#and now the others...
#we don't want to lose the existing loggers,
#since other threads may have pointers to them.
#existing is set to contain all existing loggers,
#and as we go through the new configuration we
#remove any which are configured. At the end,
#what's left in existing is the set of loggers
#which were in the previous configuration but
#which are not in the new configuration.
existing = root.manager.loggerDict.keys()
#now set up the new ones...
for log in llist:
sectname = "logger_%s" % log
qn = cp.get(sectname, "qualname")
opts = cp.options(sectname)
if "propagate" in opts:
propagate = cp.getint(sectname, "propagate")
else:
propagate = 1
logger = logging.getLogger(qn)
if qn in existing:
existing.remove(qn)
if "level" in opts:
level = cp.get(sectname, "level")
logger.setLevel(logging._levelNames[level])
for h in logger.handlers[:]:
logger.removeHandler(h)
logger.propagate = propagate
logger.disabled = 0
hlist = cp.get(sectname, "handlers")
if len(hlist):
hlist = string.split(hlist, ",")
for hand in hlist:
logger.addHandler(handlers[hand])
#Disable any old loggers. There's no point deleting
#them as other threads may continue to hold references
#and by disabling them, you stop them doing any logging.
for log in existing:
root.manager.loggerDict[log].disabled = 1
except:
import traceback
ei = sys.exc_info()
traceback.print_exception(ei[0], ei[1], ei[2], None, sys.stderr)
del ei
finally:
logging._releaseLock()
def listen(port=DEFAULT_LOGGING_CONFIG_PORT):
"""
Start up a socket server on the specified port, and listen for new
configurations.
These will be sent as a file suitable for processing by fileConfig().
Returns a Thread object on which you can call start() to start the server,
and which you can join() when appropriate. To stop the server, call
stopListening().
"""
if not thread:
raise NotImplementedError, "listen() needs threading to work"
class ConfigStreamHandler(StreamRequestHandler):
"""
Handler for a logging configuration request.
It expects a completely new logging configuration and uses fileConfig
to install it.
"""
def handle(self):
"""
Handle a request.
Each request is expected to be a 4-byte length,
followed by the config file. Uses fileConfig() to do the
grunt work.
"""
import tempfile
try:
conn = self.connection
chunk = conn.recv(4)
if len(chunk) == 4:
slen = struct.unpack(">L", chunk)[0]
chunk = self.connection.recv(slen)
while len(chunk) < slen:
chunk = chunk + conn.recv(slen - len(chunk))
#Apply new configuration. We'd like to be able to
#create a StringIO and pass that in, but unfortunately
#1.5.2 ConfigParser does not support reading file
#objects, only actual files. So we create a temporary
#file and remove it later.
file = tempfile.mktemp(".ini")
f = open(file, "w")
f.write(chunk)
f.close()
fileConfig(file)
os.remove(file)
except socket.error, e:
if type(e.args) != types.TupleType:
raise
else:
errcode = e.args[0]
if errcode != RESET_ERROR:
raise
class ConfigSocketReceiver(ThreadingTCPServer):
"""
A simple TCP socket-based logging config receiver.
"""
allow_reuse_address = 1
def __init__(self, host='localhost', port=DEFAULT_LOGGING_CONFIG_PORT,
handler=None):
ThreadingTCPServer.__init__(self, (host, port), handler)
logging._acquireLock()
self.abort = 0
logging._releaseLock()
self.timeout = 1
def serve_until_stopped(self):
import select
abort = 0
while not abort:
rd, wr, ex = select.select([self.socket.fileno()],
[], [],
self.timeout)
if rd:
self.handle_request()
logging._acquireLock()
abort = self.abort
logging._releaseLock()
def serve(rcvr, hdlr, port):
server = rcvr(port=port, handler=hdlr)
global _listener
logging._acquireLock()
_listener = server
logging._releaseLock()
server.serve_until_stopped()
return threading.Thread(target=serve,
args=(ConfigSocketReceiver,
ConfigStreamHandler, port))
def stopListening():
"""
Stop the listening server which was created with a call to listen().
"""
global _listener
if _listener:
logging._acquireLock()
_listener.abort = 1
_listener = None
logging._releaseLock()

View File

@ -0,0 +1,728 @@
# Copyright 2001-2002 by Vinay Sajip. All Rights Reserved.
#
# Permission to use, copy, modify, and distribute this software and its
# documentation for any purpose and without fee is hereby granted,
# provided that the above copyright notice appear in all copies and that
# both that copyright notice and this permission notice appear in
# supporting documentation, and that the name of Vinay Sajip
# not be used in advertising or publicity pertaining to distribution
# of the software without specific, written prior permission.
# VINAY SAJIP DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE, INCLUDING
# ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL
# VINAY SAJIP BE LIABLE FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR
# ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER
# IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT
# OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
"""
Logging package for Python. Based on PEP 282 and comments thereto in
comp.lang.python, and influenced by Apache's log4j system.
Should work under Python versions >= 1.5.2, except that source line
information is not available unless 'inspect' is.
Copyright (C) 2001-2002 Vinay Sajip. All Rights Reserved.
To use, simply 'import logging' and log away!
"""
import sys, logging, socket, types, os, string, cPickle, struct, time
from SocketServer import ThreadingTCPServer, StreamRequestHandler
#
# Some constants...
#
DEFAULT_TCP_LOGGING_PORT = 9020
DEFAULT_UDP_LOGGING_PORT = 9021
DEFAULT_HTTP_LOGGING_PORT = 9022
DEFAULT_SOAP_LOGGING_PORT = 9023
SYSLOG_UDP_PORT = 514
class RotatingFileHandler(logging.FileHandler):
def __init__(self, filename, mode="a", maxBytes=0, backupCount=0):
"""
Open the specified file and use it as the stream for logging.
By default, the file grows indefinitely. You can specify particular
values of maxBytes and backupCount to allow the file to rollover at
a predetermined size.
Rollover occurs whenever the current log file is nearly maxBytes in
length. If backupCount is >= 1, the system will successively create
new files with the same pathname as the base file, but with extensions
".1", ".2" etc. appended to it. For example, with a backupCount of 5
and a base file name of "app.log", you would get "app.log",
"app.log.1", "app.log.2", ... through to "app.log.5". The file being
written to is always "app.log" - when it gets filled up, it is closed
and renamed to "app.log.1", and if files "app.log.1", "app.log.2" etc.
exist, then they are renamed to "app.log.2", "app.log.3" etc.
respectively.
If maxBytes is zero, rollover never occurs.
"""
logging.FileHandler.__init__(self, filename, mode)
self.maxBytes = maxBytes
self.backupCount = backupCount
if maxBytes > 0:
self.mode = "a"
def doRollover(self):
"""
Do a rollover, as described in __init__().
"""
self.stream.close()
if self.backupCount > 0:
for i in range(self.backupCount - 1, 0, -1):
sfn = "%s.%d" % (self.baseFilename, i)
dfn = "%s.%d" % (self.baseFilename, i + 1)
if os.path.exists(sfn):
#print "%s -> %s" % (sfn, dfn)
if os.path.exists(dfn):
os.remove(dfn)
os.rename(sfn, dfn)
dfn = self.baseFilename + ".1"
if os.path.exists(dfn):
os.remove(dfn)
os.rename(self.baseFilename, dfn)
#print "%s -> %s" % (self.baseFilename, dfn)
self.stream = open(self.baseFilename, "w")
def emit(self, record):
"""
Emit a record.
Output the record to the file, catering for rollover as described
in doRollover().
"""
if self.maxBytes > 0: # are we rolling over?
msg = "%s\n" % self.format(record)
self.stream.seek(0, 2) #due to non-posix-compliant Windows feature
if self.stream.tell() + len(msg) >= self.maxBytes:
self.doRollover()
logging.FileHandler.emit(self, record)
class SocketHandler(logging.Handler):
"""
A handler class which writes logging records, in pickle format, to
a streaming socket. The socket is kept open across logging calls.
If the peer resets it, an attempt is made to reconnect on the next call.
The pickle which is sent is that of the LogRecord's attribute dictionary
(__dict__), so that the receiver does not need to have the logging module
installed in order to process the logging event.
To unpickle the record at the receiving end into a LogRecord, use the
makeLogRecord function.
"""
def __init__(self, host, port):
"""
Initializes the handler with a specific host address and port.
The attribute 'closeOnError' is set to 1 - which means that if
a socket error occurs, the socket is silently closed and then
reopened on the next logging call.
"""
logging.Handler.__init__(self)
self.host = host
self.port = port
self.sock = None
self.closeOnError = 0
def makeSocket(self):
"""
A factory method which allows subclasses to define the precise
type of socket they want.
"""
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect((self.host, self.port))
return s
def send(self, s):
"""
Send a pickled string to the socket.
This function allows for partial sends which can happen when the
network is busy.
"""
if hasattr(self.sock, "sendall"):
self.sock.sendall(s)
else:
sentsofar = 0
left = len(s)
while left > 0:
sent = self.sock.send(s[sentsofar:])
sentsofar = sentsofar + sent
left = left - sent
def makePickle(self, record):
"""
Pickles the record in binary format with a length prefix, and
returns it ready for transmission across the socket.
"""
s = cPickle.dumps(record.__dict__, 1)
#n = len(s)
#slen = "%c%c" % ((n >> 8) & 0xFF, n & 0xFF)
slen = struct.pack(">L", len(s))
return slen + s
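    # Receiving-end sketch for this wire format (sock is a hypothetical
    # connected socket): read the 4-byte big-endian length, unpickle the
    # attribute dict, and rebuild the record with makeLogRecord:
    #
    #   slen = struct.unpack(">L", sock.recv(4))[0]
    #   record = logging.makeLogRecord(cPickle.loads(sock.recv(slen)))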
def handleError(self, record):
"""
Handle an error during logging.
An error has occurred during logging. Most likely cause -
connection lost. Close the socket so that we can retry on the
next event.
"""
if self.closeOnError and self.sock:
self.sock.close()
self.sock = None #try to reconnect next time
else:
logging.Handler.handleError(self, record)
def emit(self, record):
"""
Emit a record.
Pickles the record and writes it to the socket in binary format.
If there is an error with the socket, silently drop the packet.
If there was a problem with the socket, re-establishes the
socket.
"""
try:
s = self.makePickle(record)
if not self.sock:
self.sock = self.makeSocket()
self.send(s)
except:
self.handleError(record)
def close(self):
"""
Closes the socket.
"""
if self.sock:
self.sock.close()
self.sock = None
class DatagramHandler(SocketHandler):
"""
A handler class which writes logging records, in pickle format, to
a datagram socket. The pickle which is sent is that of the LogRecord's
attribute dictionary (__dict__), so that the receiver does not need to
have the logging module installed in order to process the logging event.
To unpickle the record at the receiving end into a LogRecord, use the
makeLogRecord function.
"""
def __init__(self, host, port):
"""
Initializes the handler with a specific host address and port.
"""
SocketHandler.__init__(self, host, port)
self.closeOnError = 0
def makeSocket(self):
"""
The factory method of SocketHandler is here overridden to create
a UDP socket (SOCK_DGRAM).
"""
s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
return s
def send(self, s):
"""
Send a pickled string to a socket.
This function no longer allows for partial sends which can happen
when the network is busy - UDP does not guarantee delivery and
can deliver packets out of sequence.
"""
self.sock.sendto(s, (self.host, self.port))
class SysLogHandler(logging.Handler):
"""
A handler class which sends formatted logging records to a syslog
server. Based on Sam Rushing's syslog module:
http://www.nightmare.com/squirl/python-ext/misc/syslog.py
Contributed by Nicolas Untz (after which minor refactoring changes
have been made).
"""
# from <linux/sys/syslog.h>:
# ======================================================================
# priorities/facilities are encoded into a single 32-bit quantity, where
# the bottom 3 bits are the priority (0-7) and the top 28 bits are the
# facility (0-big number). Both the priorities and the facilities map
# roughly one-to-one to strings in the syslogd(8) source code. This
# mapping is included in this file.
#
# priorities (these are ordered)
LOG_EMERG = 0 # system is unusable
LOG_ALERT = 1 # action must be taken immediately
LOG_CRIT = 2 # critical conditions
LOG_ERR = 3 # error conditions
LOG_WARNING = 4 # warning conditions
LOG_NOTICE = 5 # normal but significant condition
LOG_INFO = 6 # informational
LOG_DEBUG = 7 # debug-level messages
# facility codes
LOG_KERN = 0 # kernel messages
LOG_USER = 1 # random user-level messages
LOG_MAIL = 2 # mail system
LOG_DAEMON = 3 # system daemons
LOG_AUTH = 4 # security/authorization messages
LOG_SYSLOG = 5 # messages generated internally by syslogd
LOG_LPR = 6 # line printer subsystem
LOG_NEWS = 7 # network news subsystem
LOG_UUCP = 8 # UUCP subsystem
LOG_CRON = 9 # clock daemon
LOG_AUTHPRIV = 10 # security/authorization messages (private)
# other codes through 15 reserved for system use
LOG_LOCAL0 = 16 # reserved for local use
LOG_LOCAL1 = 17 # reserved for local use
LOG_LOCAL2 = 18 # reserved for local use
LOG_LOCAL3 = 19 # reserved for local use
LOG_LOCAL4 = 20 # reserved for local use
LOG_LOCAL5 = 21 # reserved for local use
LOG_LOCAL6 = 22 # reserved for local use
LOG_LOCAL7 = 23 # reserved for local use
priority_names = {
"alert": LOG_ALERT,
"crit": LOG_CRIT,
"critical": LOG_CRIT,
"debug": LOG_DEBUG,
"emerg": LOG_EMERG,
"err": LOG_ERR,
"error": LOG_ERR, # DEPRECATED
"info": LOG_INFO,
"notice": LOG_NOTICE,
"panic": LOG_EMERG, # DEPRECATED
"warn": LOG_WARNING, # DEPRECATED
"warning": LOG_WARNING,
}
facility_names = {
"auth": LOG_AUTH,
"authpriv": LOG_AUTHPRIV,
"cron": LOG_CRON,
"daemon": LOG_DAEMON,
"kern": LOG_KERN,
"lpr": LOG_LPR,
"mail": LOG_MAIL,
"news": LOG_NEWS,
"security": LOG_AUTH, # DEPRECATED
"syslog": LOG_SYSLOG,
"user": LOG_USER,
"uucp": LOG_UUCP,
"local0": LOG_LOCAL0,
"local1": LOG_LOCAL1,
"local2": LOG_LOCAL2,
"local3": LOG_LOCAL3,
"local4": LOG_LOCAL4,
"local5": LOG_LOCAL5,
"local6": LOG_LOCAL6,
"local7": LOG_LOCAL7,
}
def __init__(self, address=('localhost', SYSLOG_UDP_PORT), facility=LOG_USER):
"""
Initialize a handler.
If address is specified as a string, UNIX socket is used.
If facility is not specified, LOG_USER is used.
"""
logging.Handler.__init__(self)
self.address = address
self.facility = facility
if type(address) == types.StringType:
self.socket = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
# syslog may require either DGRAM or STREAM sockets
try:
self.socket.connect(address)
except socket.error:
self.socket.close()
self.socket = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
self.socket.connect(address)
self.unixsocket = 1
else:
self.socket = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
self.unixsocket = 0
self.formatter = None
# curious: when talking to the unix-domain '/dev/log' socket, a
# zero-terminator seems to be required. this string is placed
# into a class variable so that it can be overridden if
# necessary.
log_format_string = '<%d>%s\000'
def encodePriority (self, facility, priority):
"""
Encode the facility and priority. You can pass in strings or
integers - if strings are passed, the facility_names and
priority_names mapping dictionaries are used to convert them to
integers.
"""
if type(facility) == types.StringType:
facility = self.facility_names[facility]
if type(priority) == types.StringType:
priority = self.priority_names[priority]
return (facility << 3) | priority
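    # For example, encodePriority("user", "info") gives
    # (LOG_USER << 3) | LOG_INFO == (1 << 3) | 6 == 14.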
def close (self):
"""
Closes the socket.
"""
if self.unixsocket:
self.socket.close()
def emit(self, record):
"""
Emit a record.
The record is formatted, and then sent to the syslog server. If
exception information is present, it is NOT sent to the server.
"""
msg = self.format(record)
"""
We need to convert record level to lowercase, maybe this will
change in the future.
"""
msg = self.log_format_string % (
self.encodePriority(self.facility,
string.lower(record.levelname)),
msg)
try:
if self.unixsocket:
self.socket.send(msg)
else:
self.socket.sendto(msg, self.address)
except:
self.handleError(record)
class SMTPHandler(logging.Handler):
"""
A handler class which sends an SMTP email for each logging event.
"""
def __init__(self, mailhost, fromaddr, toaddrs, subject):
"""
Initialize the handler.
Initialize the instance with the from and to addresses and subject
line of the email. To specify a non-standard SMTP port, use the
(host, port) tuple format for the mailhost argument.
"""
logging.Handler.__init__(self)
if type(mailhost) == types.TupleType:
host, port = mailhost
self.mailhost = host
self.mailport = port
else:
self.mailhost = mailhost
self.mailport = None
self.fromaddr = fromaddr
if type(toaddrs) == types.StringType:
toaddrs = [toaddrs]
self.toaddrs = toaddrs
self.subject = subject
def getSubject(self, record):
"""
Determine the subject for the email.
If you want to specify a subject line which is record-dependent,
override this method.
"""
return self.subject
weekdayname = ['Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun']
monthname = [None,
'Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']
def date_time(self):
"""Return the current date and time formatted for a MIME header."""
year, month, day, hh, mm, ss, wd, y, z = time.gmtime(time.time())
s = "%s, %02d %3s %4d %02d:%02d:%02d GMT" % (
self.weekdayname[wd],
day, self.monthname[month], year,
hh, mm, ss)
return s
def emit(self, record):
"""
Emit a record.
Format the record and send it to the specified addressees.
"""
try:
import smtplib
port = self.mailport
if not port:
port = smtplib.SMTP_PORT
smtp = smtplib.SMTP(self.mailhost, port)
msg = self.format(record)
msg = "From: %s\r\nTo: %s\r\nSubject: %s\r\nDate: %s\r\n\r\n%s" % (
self.fromaddr,
string.join(self.toaddrs, ","),
self.getSubject(record),
self.date_time(), msg)
smtp.sendmail(self.fromaddr, self.toaddrs, msg)
smtp.quit()
except:
self.handleError(record)
class NTEventLogHandler(logging.Handler):
"""
A handler class which sends events to the NT Event Log. Adds a
registry entry for the specified application name. If no dllname is
provided, win32service.pyd (which contains some basic message
placeholders) is used. Note that use of these placeholders will make
your event logs big, as the entire message source is held in the log.
If you want slimmer logs, you have to pass in the name of your own DLL
which contains the message definitions you want to use in the event log.
"""
def __init__(self, appname, dllname=None, logtype="Application"):
logging.Handler.__init__(self)
try:
import win32evtlogutil, win32evtlog
self.appname = appname
self._welu = win32evtlogutil
if not dllname:
dllname = os.path.split(self._welu.__file__)
dllname = os.path.split(dllname[0])
dllname = os.path.join(dllname[0], r'win32service.pyd')
self.dllname = dllname
self.logtype = logtype
self._welu.AddSourceToRegistry(appname, dllname, logtype)
self.deftype = win32evtlog.EVENTLOG_ERROR_TYPE
self.typemap = {
logging.DEBUG : win32evtlog.EVENTLOG_INFORMATION_TYPE,
logging.INFO : win32evtlog.EVENTLOG_INFORMATION_TYPE,
logging.WARNING : win32evtlog.EVENTLOG_WARNING_TYPE,
logging.ERROR : win32evtlog.EVENTLOG_ERROR_TYPE,
logging.CRITICAL: win32evtlog.EVENTLOG_ERROR_TYPE,
}
except ImportError:
print "The Python Win32 extensions for NT (service, event "\
"logging) appear not to be available."
self._welu = None
def getMessageID(self, record):
"""
Return the message ID for the event record. If you are using your
own messages, you could do this by having the msg passed to the
logger be an ID rather than a formatting string. Then, in here,
you could use a dictionary lookup to get the message ID. This
version returns 1, which is the base message ID in win32service.pyd.
"""
return 1
def getEventCategory(self, record):
"""
Return the event category for the record.
Override this if you want to specify your own categories. This version
returns 0.
"""
return 0
def getEventType(self, record):
"""
Return the event type for the record.
Override this if you want to specify your own types. This version does
a mapping using the handler's typemap attribute, which is set up in
__init__() to a dictionary which contains mappings for DEBUG, INFO,
WARNING, ERROR and CRITICAL. If you are using your own levels you will
either need to override this method or place a suitable dictionary in
the handler's typemap attribute.
"""
return self.typemap.get(record.levelno, self.deftype)
def emit(self, record):
"""
Emit a record.
Determine the message ID, event category and event type. Then
log the message in the NT event log.
"""
if self._welu:
try:
id = self.getMessageID(record)
cat = self.getEventCategory(record)
type = self.getEventType(record)
msg = self.format(record)
self._welu.ReportEvent(self.appname, id, cat, type, [msg])
except:
self.handleError(record)
def close(self):
"""
Clean up this handler.
You can remove the application name from the registry as a
source of event log entries. However, if you do this, you will
not be able to see the events as you intended in the Event Log
Viewer - it needs to be able to access the registry to get the
DLL name.
"""
#self._welu.RemoveSourceFromRegistry(self.appname, self.logtype)
pass
class HTTPHandler(logging.Handler):
"""
A class which sends records to a Web server, using either GET or
POST semantics.
"""
def __init__(self, host, url, method="GET"):
"""
Initialize the instance with the host, the request URL, and the method
("GET" or "POST")
"""
logging.Handler.__init__(self)
method = string.upper(method)
if method not in ["GET", "POST"]:
raise ValueError, "method must be GET or POST"
self.host = host
self.url = url
self.method = method
def mapLogRecord(self, record):
"""
Default implementation of mapping the log record into a dict
that is sent as the CGI data. Override in your class.
Contributed by Franz Glasner.
"""
return record.__dict__
def emit(self, record):
"""
Emit a record.
Send the record to the Web server as a URL-encoded dictionary.
"""
try:
import httplib, urllib
h = httplib.HTTP(self.host)
url = self.url
data = urllib.urlencode(self.mapLogRecord(record))
if self.method == "GET":
if (string.find(url, '?') >= 0):
sep = '&'
else:
sep = '?'
url = url + "%c%s" % (sep, data)
h.putrequest(self.method, url)
if self.method == "POST":
h.putheader("Content-length", str(len(data)))
h.endheaders()
if self.method == "POST":
h.send(data)
h.getreply() #can't do anything with the result
except:
self.handleError(record)
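# Usage sketch (hypothetical endpoint): mapLogRecord() turns each record
# into a dict, which emit() URL-encodes and sends to the server.
#
#   web = HTTPHandler('example.com:80', '/log', method='POST')
#   logging.getLogger('planet').addHandler(web)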
class BufferingHandler(logging.Handler):
"""
A handler class which buffers logging records in memory. Whenever each
record is added to the buffer, a check is made to see if the buffer should
be flushed. If it should, then flush() is expected to do what's needed.
"""
def __init__(self, capacity):
"""
Initialize the handler with the buffer size.
"""
logging.Handler.__init__(self)
self.capacity = capacity
self.buffer = []
def shouldFlush(self, record):
"""
Should the handler flush its buffer?
Returns true if the buffer is up to capacity. This method can be
overridden to implement custom flushing strategies.
"""
return (len(self.buffer) >= self.capacity)
def emit(self, record):
"""
Emit a record.
Append the record. If shouldFlush() tells us to, call flush() to process
the buffer.
"""
self.buffer.append(record)
if self.shouldFlush(record):
self.flush()
def flush(self):
"""
Override to implement custom flushing behaviour.
This version just zaps the buffer to empty.
"""
self.buffer = []
class MemoryHandler(BufferingHandler):
"""
A handler class which buffers logging records in memory, periodically
flushing them to a target handler. Flushing occurs whenever the buffer
is full, or when an event of a certain severity or greater is seen.
"""
def __init__(self, capacity, flushLevel=logging.ERROR, target=None):
"""
Initialize the handler with the buffer size, the level at which
flushing should occur and an optional target.
Note that without a target being set either here or via setTarget(),
a MemoryHandler is no use to anyone!
"""
BufferingHandler.__init__(self, capacity)
self.flushLevel = flushLevel
self.target = target
def shouldFlush(self, record):
"""
Check for buffer full or a record at the flushLevel or higher.
"""
return (len(self.buffer) >= self.capacity) or \
(record.levelno >= self.flushLevel)
def setTarget(self, target):
"""
Set the target handler for this handler.
"""
self.target = target
def flush(self):
"""
For a MemoryHandler, flushing means just sending the buffered
records to the target, if there is one. Override if you want
different behaviour.
"""
if self.target:
for record in self.buffer:
self.target.handle(record)
self.buffer = []
def close(self):
"""
Flush, set the target to None and lose the buffer.
"""
self.flush()
self.target = None
self.buffer = []
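# A composition sketch (editor's example): hold records in memory and
# only write them out when the buffer fills or an ERROR arrives.
#
#   target = logging.StreamHandler()
#   buffered = MemoryHandler(100, flushLevel=logging.ERROR, target=target)
#   logging.getLogger('planet').addHandler(buffered)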

2931
planet/feedparser.py Normal file

File diff suppressed because it is too large

1480
planet/htmltmpl.py Normal file

File diff suppressed because it is too large

18624
planet/log/planet.log Normal file

File diff suppressed because it is too large

BIN
planet/log/planet.log.1.gz Normal file

Binary file not shown.

BIN
planet/log/planet.log.2.gz Normal file

Binary file not shown.

BIN
planet/log/planet.log.3.gz Normal file

Binary file not shown.

BIN
planet/log/planet.log.4.gz Normal file

Binary file not shown.

354
planet/sanitize.py Normal file
View File

@ -0,0 +1,354 @@
"""
sanitize: bringing sanity to a world of messed-up data
"""
__author__ = ["Mark Pilgrim <http://diveintomark.org/>",
"Aaron Swartz <http://www.aaronsw.com/>"]
__contributors__ = ["Sam Ruby <http://intertwingly.net/>"]
__license__ = "BSD"
__version__ = "0.25"
_debug = 0
# If you want sanitize to automatically run HTML markup through HTML Tidy, set
# this to 1. Requires mxTidy <http://www.egenix.com/files/python/mxTidy.html>
# or utidylib <http://utidylib.berlios.de/>.
TIDY_MARKUP = 0
# List of Python interfaces for HTML Tidy, in order of preference. Only useful
# if TIDY_MARKUP = 1
PREFERRED_TIDY_INTERFACES = ["uTidy", "mxTidy"]
import sgmllib, re, sys
# chardet library auto-detects character encodings
# Download from http://chardet.feedparser.org/
try:
import chardet
if _debug:
import chardet.constants
chardet.constants._debug = 1
_chardet = lambda data: chardet.detect(data)['encoding']
except:
chardet = None
_chardet = lambda data: None
class _BaseHTMLProcessor(sgmllib.SGMLParser):
elements_no_end_tag = ['area', 'base', 'basefont', 'br', 'col', 'frame', 'hr',
'img', 'input', 'isindex', 'link', 'meta', 'param']
_r_barebang = re.compile(r'<!((?!DOCTYPE|--|\[))', re.IGNORECASE)
_r_bareamp = re.compile("&(?!#\d+;|#x[0-9a-fA-F]+;|\w+;)")
_r_shorttag = re.compile(r'<([^<\s]+?)\s*/>')
def __init__(self, encoding):
self.encoding = encoding
if _debug: sys.stderr.write('entering BaseHTMLProcessor, encoding=%s\n' % self.encoding)
sgmllib.SGMLParser.__init__(self)
def reset(self):
self.pieces = []
sgmllib.SGMLParser.reset(self)
def _shorttag_replace(self, match):
tag = match.group(1)
if tag in self.elements_no_end_tag:
return '<' + tag + ' />'
else:
return '<' + tag + '></' + tag + '>'
def feed(self, data):
data = self._r_barebang.sub(r'&lt;!\1', data)
data = self._r_bareamp.sub("&amp;", data)
data = self._r_shorttag.sub(self._shorttag_replace, data)
if self.encoding and type(data) == type(u''):
data = data.encode(self.encoding)
sgmllib.SGMLParser.feed(self, data)
def normalize_attrs(self, attrs):
# utility method to be called by descendants
attrs = [(k.lower(), v) for k, v in attrs]
attrs = [(k, k in ('rel', 'type') and v.lower() or v) for k, v in attrs]
return attrs
def unknown_starttag(self, tag, attrs):
# called for each start tag
# attrs is a list of (attr, value) tuples
# e.g. for <pre class='screen'>, tag='pre', attrs=[('class', 'screen')]
if _debug: sys.stderr.write('_BaseHTMLProcessor, unknown_starttag, tag=%s\n' % tag)
uattrs = []
# thanks to Kevin Marks for this breathtaking hack to deal with (valid) high-bit attribute values in UTF-8 feeds
for key, value in attrs:
if type(value) != type(u''):
value = unicode(value, self.encoding)
uattrs.append((unicode(key, self.encoding), value))
strattrs = u''.join([u' %s="%s"' % (key, value) for key, value in uattrs]).encode(self.encoding)
if tag in self.elements_no_end_tag:
self.pieces.append('<%(tag)s%(strattrs)s />' % locals())
else:
self.pieces.append('<%(tag)s%(strattrs)s>' % locals())
def unknown_endtag(self, tag):
# called for each end tag, e.g. for </pre>, tag will be 'pre'
# Reconstruct the original end tag.
if tag not in self.elements_no_end_tag:
self.pieces.append("</%(tag)s>" % locals())
def handle_charref(self, ref):
# called for each character reference, e.g. for '&#160;', ref will be '160'
# Reconstruct the original character reference.
self.pieces.append('&#%(ref)s;' % locals())
def handle_entityref(self, ref):
# called for each entity reference, e.g. for '&copy;', ref will be 'copy'
# Reconstruct the original entity reference.
self.pieces.append('&%(ref)s;' % locals())
def handle_data(self, text):
# called for each block of plain text, i.e. outside of any tag and
# not containing any character or entity references
# Store the original text verbatim.
if _debug: sys.stderr.write('_BaseHTMLProcessor, handle_text, text=%s\n' % text)
self.pieces.append(text)
def handle_comment(self, text):
# called for each HTML comment, e.g. <!-- insert Javascript code here -->
# Reconstruct the original comment.
self.pieces.append('<!--%(text)s-->' % locals())
def handle_pi(self, text):
# called for each processing instruction, e.g. <?instruction>
# Reconstruct original processing instruction.
self.pieces.append('<?%(text)s>' % locals())
def handle_decl(self, text):
# called for the DOCTYPE, if present, e.g.
# <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
# "http://www.w3.org/TR/html4/loose.dtd">
# Reconstruct original DOCTYPE
self.pieces.append('<!%(text)s>' % locals())
_new_declname_match = re.compile(r'[a-zA-Z][-_.a-zA-Z0-9:]*\s*').match
def _scan_name(self, i, declstartpos):
rawdata = self.rawdata
n = len(rawdata)
if i == n:
return None, -1
m = self._new_declname_match(rawdata, i)
if m:
s = m.group()
name = s.strip()
if (i + len(s)) == n:
return None, -1 # end of buffer
return name.lower(), m.end()
else:
self.handle_data(rawdata)
# self.updatepos(declstartpos, i)
return None, -1
def output(self):
'''Return processed HTML as a single string'''
return ''.join([str(p) for p in self.pieces])
class _HTMLSanitizer(_BaseHTMLProcessor):
acceptable_elements = ['a', 'abbr', 'acronym', 'address', 'area', 'b', 'big',
'blockquote', 'br', 'button', 'caption', 'center', 'cite', 'code', 'col',
'colgroup', 'dd', 'del', 'dfn', 'dir', 'div', 'dl', 'dt', 'em', 'fieldset',
'font', 'form', 'h1', 'h2', 'h3', 'h4', 'h5', 'h6', 'hr', 'i', 'img', 'input',
'ins', 'kbd', 'label', 'legend', 'li', 'map', 'menu', 'ol', 'optgroup',
'option', 'p', 'pre', 'q', 's', 'samp', 'select', 'small', 'span', 'strike',
'strong', 'sub', 'sup', 'table', 'textarea', 'tbody', 'td', 'tfoot', 'th',
'thead', 'tr', 'tt', 'u', 'ul', 'var']
acceptable_attributes = ['abbr', 'accept', 'accept-charset', 'accesskey',
'action', 'align', 'alt', 'axis', 'border', 'cellpadding', 'cellspacing',
'char', 'charoff', 'charset', 'checked', 'cite', 'class', 'clear', 'cols',
'colspan', 'color', 'compact', 'coords', 'datetime', 'dir', 'disabled',
'enctype', 'for', 'frame', 'headers', 'height', 'href', 'hreflang', 'hspace',
'id', 'ismap', 'label', 'lang', 'longdesc', 'maxlength', 'media', 'method',
'multiple', 'name', 'nohref', 'noshade', 'nowrap', 'prompt', 'readonly',
'rel', 'rev', 'rows', 'rowspan', 'rules', 'scope', 'selected', 'shape', 'size',
'span', 'src', 'start', 'summary', 'tabindex', 'target', 'title', 'type',
'usemap', 'valign', 'value', 'vspace', 'width']
ignorable_elements = ['script', 'applet', 'style']
def reset(self):
_BaseHTMLProcessor.reset(self)
self.tag_stack = []
self.ignore_level = 0
def feed(self, data):
_BaseHTMLProcessor.feed(self, data)
while self.tag_stack:
_BaseHTMLProcessor.unknown_endtag(self, self.tag_stack.pop())
def unknown_starttag(self, tag, attrs):
if tag in self.ignorable_elements:
self.ignore_level += 1
return
if self.ignore_level:
return
if tag in self.acceptable_elements:
attrs = self.normalize_attrs(attrs)
attrs = [(key, value) for key, value in attrs if key in self.acceptable_attributes]
if tag not in self.elements_no_end_tag:
self.tag_stack.append(tag)
_BaseHTMLProcessor.unknown_starttag(self, tag, attrs)
def unknown_endtag(self, tag):
if tag in self.ignorable_elements:
self.ignore_level -= 1
return
if self.ignore_level:
return
if tag in self.acceptable_elements and tag not in self.elements_no_end_tag:
match = False
while self.tag_stack:
top = self.tag_stack.pop()
if top == tag:
match = True
break
_BaseHTMLProcessor.unknown_endtag(self, top)
if match:
_BaseHTMLProcessor.unknown_endtag(self, tag)
def handle_pi(self, text):
pass
def handle_decl(self, text):
pass
def handle_data(self, text):
if not self.ignore_level:
text = text.replace('<', '')
_BaseHTMLProcessor.handle_data(self, text)
def HTML(htmlSource, encoding='utf8'):
p = _HTMLSanitizer(encoding)
p.feed(htmlSource)
data = p.output()
if TIDY_MARKUP:
# loop through list of preferred Tidy interfaces looking for one that's installed,
# then set up a common _tidy function to wrap the interface-specific API.
_tidy = None
for tidy_interface in PREFERRED_TIDY_INTERFACES:
try:
if tidy_interface == "uTidy":
from tidy import parseString as _utidy
def _tidy(data, **kwargs):
return str(_utidy(data, **kwargs))
break
elif tidy_interface == "mxTidy":
from mx.Tidy import Tidy as _mxtidy
def _tidy(data, **kwargs):
nerrors, nwarnings, data, errordata = _mxtidy.tidy(data, **kwargs)
return data
break
except:
pass
if _tidy:
utf8 = type(data) == type(u'')
if utf8:
data = data.encode('utf-8')
data = _tidy(data, output_xhtml=1, numeric_entities=1, wrap=0, char_encoding="utf8")
if utf8:
data = unicode(data, 'utf-8')
if data.count('<body'):
data = data.split('<body', 1)[1]
if data.count('>'):
data = data.split('>', 1)[1]
if data.count('</body'):
data = data.split('</body', 1)[0]
data = data.strip().replace('\r\n', '\n')
return data
unicode_bom_map = {
'\x00\x00\xfe\xff': 'utf-32be',
'\xff\xfe\x00\x00': 'utf-32le',
'\xfe\xff##': 'utf-16be',
'\xff\xfe##': 'utf-16le',
'\xef\xbb\xbf': 'utf-8'
}
xml_bom_map = {
'\x00\x00\x00\x3c': 'utf-32be',
'\x3c\x00\x00\x00': 'utf-32le',
'\x00\x3c\x00\x3f': 'utf-16be',
'\x3c\x00\x3f\x00': 'utf-16le',
'\x3c\x3f\x78\x6d': 'utf-8', # or equivalent
'\x4c\x6f\xa7\x94': 'ebcdic'
}
_ebcdic_to_ascii_map = None
def _ebcdic_to_ascii(s):
global _ebcdic_to_ascii_map
if not _ebcdic_to_ascii_map:
emap = (
0,1,2,3,156,9,134,127,151,141,142,11,12,13,14,15,
16,17,18,19,157,133,8,135,24,25,146,143,28,29,30,31,
128,129,130,131,132,10,23,27,136,137,138,139,140,5,6,7,
144,145,22,147,148,149,150,4,152,153,154,155,20,21,158,26,
32,160,161,162,163,164,165,166,167,168,91,46,60,40,43,33,
38,169,170,171,172,173,174,175,176,177,93,36,42,41,59,94,
45,47,178,179,180,181,182,183,184,185,124,44,37,95,62,63,
186,187,188,189,190,191,192,193,194,96,58,35,64,39,61,34,
195,97,98,99,100,101,102,103,104,105,196,197,198,199,200,201,
202,106,107,108,109,110,111,112,113,114,203,204,205,206,207,208,
209,126,115,116,117,118,119,120,121,122,210,211,212,213,214,215,
216,217,218,219,220,221,222,223,224,225,226,227,228,229,230,231,
123,65,66,67,68,69,70,71,72,73,232,233,234,235,236,237,
125,74,75,76,77,78,79,80,81,82,238,239,240,241,242,243,
92,159,83,84,85,86,87,88,89,90,244,245,246,247,248,249,
48,49,50,51,52,53,54,55,56,57,250,251,252,253,254,255
)
import string
_ebcdic_to_ascii_map = string.maketrans( \
''.join(map(chr, range(256))), ''.join(map(chr, emap)))
return s.translate(_ebcdic_to_ascii_map)
def _startswithbom(text, bom):
for i, c in enumerate(bom):
if c == '#':
if text[i] == '\x00':
return False
else:
if text[i] != c:
return False
return True
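# The '#' positions in a pattern are wildcards that must be non-NUL:
# _startswithbom('\xff\xfeab', '\xff\xfe##') is true, while the UTF-32LE
# BOM '\xff\xfe\x00\x00' fails the same pattern, which is how the UTF-16
# and UTF-32 little-endian BOMs are told apart.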
def _detectbom(text, bom_map=unicode_bom_map):
for bom, encoding in bom_map.iteritems():
if _startswithbom(text, bom):
return encoding
return None
def characters(text, isXML=False, guess=None):
"""
Takes a string text of unknown encoding and tries to
provide a Unicode string for it.
"""
_triedEncodings = []
def tryEncoding(encoding):
if encoding and encoding not in _triedEncodings:
if encoding == 'ebcdic':
return _ebcdic_to_ascii(text)
try:
return unicode(text, encoding)
except UnicodeDecodeError:
pass
_triedEncodings.append(encoding)
return (
tryEncoding(guess) or
tryEncoding(_detectbom(text)) or
isXML and tryEncoding(_detectbom(text, xml_bom_map)) or
tryEncoding(_chardet(text)) or
tryEncoding('utf8') or
tryEncoding('windows-1252') or
tryEncoding('iso-8859-1'))
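if __name__ == '__main__':
    # Editor's demonstration sketch of the two public entry points
    # (the sample strings are illustrative):
    print HTML('safe<script type="text/javascript">alert(1)</script> <b>description</b>')
    # -> 'safe <b>description</b>'
    print repr(characters('caf\xc3\xa9'))
    # -> u'caf\xe9' (the UTF-8 bytes decode on the tryEncoding('utf8') pass)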

0
planet/tests/__init__.py Normal file
View File


@ -0,0 +1,38 @@
#!/usr/bin/env python
import unittest
import planet
import tempfile
import ConfigParser
class FakePlanet:
"""
A dummy Planet object that's enough to fool the
Channel.__init__ method
"""
def __init__(self):
self.cache_directory = tempfile.gettempdir()
self.config = ConfigParser.ConfigParser()
class FeedInformationTest(unittest.TestCase):
"""
Test the Channel.feed_information method
"""
def setUp(self):
self.url = 'URL'
self.changed_url = 'Changed URL'
self.channel = planet.Channel(FakePlanet(), self.url)
def test_unchangedurl(self):
self.assertEqual(self.channel.feed_information(), '<%s>' % self.url)
def test_changedurl(self):
# change the URL directly
self.channel.url = self.changed_url
self.assertEqual(self.channel.feed_information(),
"<%s> (formerly <%s>)" % (self.changed_url, self.url))
if __name__ == '__main__':
unittest.main()

71
planet/tests/test_main.py Normal file
View File

@ -0,0 +1,71 @@
#!/usr/bin/env python
import os, sys, shutil, errno, unittest
from ConfigParser import ConfigParser
from StringIO import StringIO
import planet
class MainTest(unittest.TestCase):
def test_minimal(self):
configp = ConfigParser()
my_planet = planet.Planet(configp)
my_planet.run("Planet Name", "http://example.com", [])
def test_onefeed(self):
configp = ConfigParser()
configp.readfp(StringIO("""[http://www.example.com/]
name = Mary
"""))
my_planet = planet.Planet(configp)
my_planet.run("Planet Name", "http://example.com", [], True)
def test_generateall(self):
configp = ConfigParser()
configp.readfp(StringIO("""[http://www.example.com/]
name = Mary
"""))
my_planet = planet.Planet(configp)
my_planet.run("Planet Name", "http://example.com", [], True)
basedir = os.path.join(os.path.dirname(os.path.abspath(sys.modules[__name__].__file__)), 'data')
os.mkdir(self.output_dir)
t_file_names = ['simple', 'simple2']
self._remove_cached_templates(basedir, t_file_names)
t_files = [os.path.join(basedir, t_file) + '.tmpl' for t_file in t_file_names]
my_planet.generate_all_files(t_files, "Planet Name",
'http://example.com/', 'http://example.com/feed/', 'Mary', 'mary@example.com')
for file_name in t_file_names:
name = os.path.join(self.output_dir, file_name)
content = file(name).read()
self.assertEqual(content, 'Mary\n')
def _remove_cached_templates(self, basedir, template_files):
"""
Remove the .tmplc files and force them to be rebuilt.
This is required mainly so that the tests don't fail in mysterious ways in
directories that have been moved, eg 'branches/my-branch' to
'branches/mysterious-branch' -- the .tmplc files seem to remember their full
path
"""
for file in template_files:
path = os.path.join(basedir, file + '.tmplc')
try:
os.remove(path)
except OSError, e:
# we don't care about the file not being there, we care about
# everything else
if e.errno != errno.ENOENT:
raise
def setUp(self):
super(MainTest, self).setUp()
self.output_dir = 'output'
def tearDown(self):
super(MainTest, self).tearDown()
shutil.rmtree(self.output_dir, ignore_errors = True)
shutil.rmtree('cache', ignore_errors = True)
if __name__ == '__main__':
unittest.main()

View File

@ -0,0 +1,125 @@
# adapted from http://www.iamcal.com/publish/articles/php/processing_html_part_2/
# and from http://feedparser.org/tests/wellformed/sanitize/
# by Aaron Swartz, 2006, public domain
import unittest, new
from planet import sanitize
class SanitizeTest(unittest.TestCase): pass
# each call to HTML adds a test case to SanitizeTest
testcases = 0
def HTML(a, b):
global testcases
testcases += 1
func = lambda self: self.assertEqual(sanitize.HTML(a), b)
method = new.instancemethod(func, None, SanitizeTest)
setattr(SanitizeTest, "test_%d" % testcases, method)
## basics
HTML("","")
HTML("hello","hello")
## balancing tags
HTML("<b>hello","<b>hello</b>")
HTML("hello<b>","hello<b></b>")
HTML("hello</b>","hello")
HTML("hello<b/>","hello<b></b>")
HTML("<b><b><b>hello","<b><b><b>hello</b></b></b>")
HTML("</b><b>","<b></b>")
## trailing slashes
HTML('<img>','<img />')
HTML('<img/>','<img />')
HTML('<b/></b>','<b></b>')
## balancing angle brackets
HTML('<img src="foo"','')
HTML('b>','b>')
HTML('<img src="foo"/','')
HTML('>','>')
HTML('foo<b','foo')
HTML('b>foo','b>foo')
HTML('><b','>')
HTML('b><','b>')
HTML('><b>','><b></b>')
## attributes
HTML('<img src=foo>','<img src="foo" />')
HTML('<img asrc=foo>','<img />')
HTML('<img src=test test>','<img src="test" />')
## dangerous tags (a small sample)
sHTML = lambda x: HTML(x, 'safe <b>description</b>')
sHTML('safe<applet code="foo.class" codebase="http://example.com/"></applet> <b>description</b>')
sHTML('<notinventedyet>safe</notinventedyet> <b>description</b>')
sHTML('<blink>safe</blink> <b>description</b>')
sHTML('safe<embed src="http://example.com/"> <b>description</b>')
sHTML('safe<frameset rows="*"><frame src="http://example.com/"></frameset> <b>description</b>')
sHTML('safe<iframe src="http://example.com/"> <b>description</b></iframe>')
sHTML('safe<link rel="stylesheet" type="text/css" href="http://example.com/evil.css"> <b>description</b>')
sHTML('safe<meta http-equiv="Refresh" content="0; URL=http://example.com/"> <b>description</b>')
sHTML('safe<object classid="clsid:C932BA85-4374-101B-A56C-00AA003668DC"> <b>description</b>')
sHTML('safe<script type="text/javascript">location.href=\'http:/\'+\'/example.com/\';</script> <b>description</b>')
for x in ['onabort', 'onblur', 'onchange', 'onclick', 'ondblclick', 'onerror', 'onfocus', 'onkeydown', 'onkeypress', 'onkeyup', 'onload', 'onmousedown', 'onmouseout', 'onmouseover', 'onmouseup', 'onreset', 'resize', 'onsubmit', 'onunload']:
HTML('<img src="http://www.ragingplatypus.com/i/cam-full.jpg" %s="location.href=\'http://www.ragingplatypus.com/\';" />' % x,
'<img src="http://www.ragingplatypus.com/i/cam-full.jpg" />')
HTML('<a href="http://www.ragingplatypus.com/" style="display:block; position:absolute; left:0; top:0; width:100%; height:100%; z-index:1; background-color:black; background-image:url(http://www.ragingplatypus.com/i/cam-full.jpg); background-x:center; background-y:center; background-repeat:repeat;">never trust your upstream platypus</a>', '<a href="http://www.ragingplatypus.com/">never trust your upstream platypus</a>')
## ignorables
HTML('foo<style>bar', 'foo')
HTML('foo<style>bar</style>', 'foo')
## non-allowed tags
HTML('<script>','')
HTML('<script','')
HTML('<script/>','')
HTML('</script>','')
HTML('<script woo=yay>','')
HTML('<script woo="yay">','')
HTML('<script woo="yay>','')
HTML('<script woo="yay<b>','')
HTML('<script<script>>','')
HTML('<<script>script<script>>','')
HTML('<<script><script>>','')
HTML('<<script>script>>','')
HTML('<<script<script>>','')
## bad protocols
HTML('<a href="http://foo">bar</a>', '<a href="http://foo">bar</a>')
HTML('<a href="ftp://foo">bar</a>', '<a href="ftp://foo">bar</a>')
HTML('<a href="mailto:foo">bar</a>', '<a href="mailto:foo">bar</a>')
# not yet supported:
#HTML('<a href="javascript:foo">bar</a>', '<a href="#foo">bar</a>')
#HTML('<a href="java script:foo">bar</a>', '<a href="#foo">bar</a>')
#HTML('<a href="java\tscript:foo">bar</a>', '<a href="#foo">bar</a>')
#HTML('<a href="java\nscript:foo">bar</a>', '<a href="#foo">bar</a>')
#HTML('<a href="java'+chr(1)+'script:foo">bar</a>', '<a href="#foo">bar</a>')
#HTML('<a href="jscript:foo">bar</a>', '<a href="#foo">bar</a>')
#HTML('<a href="vbscript:foo">bar</a>', '<a href="#foo">bar</a>')
#HTML('<a href="view-source:foo">bar</a>', '<a href="#foo">bar</a>')
## auto closers
HTML('<img src="a">', '<img src="a" />')
HTML('<img src="a">foo</img>', '<img src="a" />foo')
HTML('</img>', '')
## crazy: http://alpha-geek.com/example/crazy_html2.html
HTML('<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">\r\n\r\n<html xmlns="http://www.w3.org/1999/xhtml">\r\n<head>\r\n<title>Crazy HTML -- Can Your Regex Parse This?</title>\r\n</head>\r\n<body notRealAttribute="value"onload="executeMe();"foo="bar"\r\n\r\n>\r\n<!-- <script> -->\r\n\r\n<!-- \r\n\t<script> \r\n-->\r\n\r\n</script>\r\n\r\n\r\n<script\r\n\r\n\r\n>\r\n\r\nfunction executeMe()\r\n{\r\n\r\n\r\n\r\n\r\n/* <script> \r\nfunction am_i_javascript()\r\n{\r\n\tvar str = "Some innocuously commented out stuff";\r\n}\r\n< /script>\r\n*/\r\n\r\n\t\r\n\t\r\n\t\r\n\t\r\n\t\r\n\t\r\n\t\r\n\t\r\n\talert("Executed");\r\n}\r\n\r\n </script\r\n\r\n\r\n\r\n>\r\n<h1>Did The Javascript Execute?</h1>\r\n<div notRealAttribute="value\r\n"onmouseover="\r\nexecuteMe();\r\n"foo="bar">\r\nI will execute here, too, if you mouse over me\r\n</div>\r\nThis is to keep you guys honest...<br />\r\nI like ontonology. I like to script ontology. Though, script>style>this.\r\n</body>\r\n</html>', 'Crazy HTML -- Can Your Regex Parse This?\n\n\n<!-- <script> -->\n\n<!-- \n\t<script> \n-->\n\n\n\nfunction executeMe()\n{\n\n\n\n\n/* \n<h1>Did The Javascript Execute?</h1>\n<div>\nI will execute here, too, if you mouse over me\n</div>\nThis is to keep you guys honest...<br />\nI like ontonology. I like to script ontology. Though, script>style>this.')
# valid entity references
HTML("&nbsp;","&nbsp;");
HTML("&#160;","&#160;");
HTML("&#xa0;","&#xa0;");
HTML("&#xA0;","&#xA0;");
# unescaped ampersands
HTML("AT&T","AT&amp;T");
HTML("http://example.org?a=1&b=2","http://example.org?a=1&amp;b=2");
# quote characters
HTML('<a title="&#34;">quote</a>','<a title="&#34;">quote</a>')
HTML('<a title="&#39;">quote</a>','<a title="&#39;">quote</a>')

424
planet/timeoutsocket.py Normal file
View File

@ -0,0 +1,424 @@
####
# Copyright 2000,2001 by Timothy O'Malley <timo@alum.mit.edu>
#
# All Rights Reserved
#
# Permission to use, copy, modify, and distribute this software
# and its documentation for any purpose and without fee is hereby
# granted, provided that the above copyright notice appear in all
# copies and that both that copyright notice and this permission
# notice appear in supporting documentation, and that the name of
# Timothy O'Malley not be used in advertising or publicity
# pertaining to distribution of the software without specific, written
# prior permission.
#
# Timothy O'Malley DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS
# SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY
# AND FITNESS, IN NO EVENT SHALL Timothy O'Malley BE LIABLE FOR
# ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
# WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS,
# WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS
# ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR
# PERFORMANCE OF THIS SOFTWARE.
#
####
"""Timeout Socket
This module enables a timeout mechanism on all TCP connections. It
does this by inserting a shim into the socket module. After this module
has been imported, all socket creation goes through this shim. As a
result, every TCP connection will support a timeout.
The beauty of this method is that it immediately and transparently
enables the entire python library to support timeouts on TCP sockets.
As an example, if you wanted SMTP connections to have a 20 second
timeout:
import timeoutsocket
import smtplib
timeoutsocket.setDefaultSocketTimeout(20)
The timeout applies to the socket functions that normally block on
execution: read, write, connect, and accept. If any of these
operations exceeds the specified timeout, the exception Timeout
will be raised.
The default timeout value is set to None. As a result, importing
this module does not change the default behavior of a socket. The
timeout mechanism only activates when the timeout has been set to
a numeric value. (This behavior mimics the behavior of the
select.select() function.)
This module implements two classes: TimeoutSocket and TimeoutFile.
The TimeoutSocket class defines a socket-like object that attempts to
avoid the condition where a socket may block indefinitely. The
TimeoutSocket class raises a Timeout exception whenever the
current operation delays too long.
The TimeoutFile class defines a file-like object that uses the TimeoutSocket
class. When the makefile() method of TimeoutSocket is called, it returns
an instance of a TimeoutFile.
Each of these objects adds two methods to manage the timeout value:
get_timeout() --> returns the timeout of the socket or file
set_timeout() --> sets the timeout of the socket or file
As an example, one might use the timeout feature to create httplib
connections that will timeout after 30 seconds:
import timeoutsocket
import httplib
H = httplib.HTTP("www.python.org")
H.sock.set_timeout(30)
Note: When used in this manner, the connect() routine may still
block because it happens before the timeout is set. To avoid
this, use the 'timeoutsocket.setDefaultSocketTimeout()' function.
Good Luck!
"""
__version__ = "$Revision: 1.1.1.1 $"
__author__ = "Timothy O'Malley <timo@alum.mit.edu>"
#
# Imports
#
import select, string
import socket
if not hasattr(socket, "_no_timeoutsocket"):
_socket = socket.socket
else:
_socket = socket._no_timeoutsocket
#
# Set up constants to test for Connected and Blocking operations.
# We delete 'os' and 'errno' to keep our namespace clean(er).
# Thanks to Alex Martelli and G. Li for the Windows error codes.
#
import os
if os.name == "nt":
_IsConnected = ( 10022, 10056 )
_ConnectBusy = ( 10035, )
_AcceptBusy = ( 10035, )
else:
import errno
_IsConnected = ( errno.EISCONN, )
_ConnectBusy = ( errno.EINPROGRESS, errno.EALREADY, errno.EWOULDBLOCK )
_AcceptBusy = ( errno.EAGAIN, errno.EWOULDBLOCK )
del errno
del os
#
# Default timeout value for ALL TimeoutSockets
#
_DefaultTimeout = None
def setDefaultSocketTimeout(timeout):
global _DefaultTimeout
_DefaultTimeout = timeout
def getDefaultSocketTimeout():
return _DefaultTimeout
#
# Exceptions for socket errors and timeouts
#
Error = socket.error
class Timeout(Exception):
pass
#
# Factory function
#
from socket import AF_INET, SOCK_STREAM
def timeoutsocket(family=AF_INET, type=SOCK_STREAM, proto=None):
if family != AF_INET or type != SOCK_STREAM:
if proto:
return _socket(family, type, proto)
else:
return _socket(family, type)
return TimeoutSocket( _socket(family, type), _DefaultTimeout )
# end timeoutsocket
#
# The TimeoutSocket class definition
#
class TimeoutSocket:
"""TimeoutSocket object
Implements a socket-like object that raises Timeout whenever
an operation takes too long.
The definition of 'too long' can be changed using the
set_timeout() method.
"""
_copies = 0
_blocking = 1
def __init__(self, sock, timeout):
self._sock = sock
self._timeout = timeout
# end __init__
def __getattr__(self, key):
return getattr(self._sock, key)
# end __getattr__
def get_timeout(self):
return self._timeout
# end set_timeout
def set_timeout(self, timeout=None):
self._timeout = timeout
# end set_timeout
def setblocking(self, blocking):
self._blocking = blocking
return self._sock.setblocking(blocking)
# end set_timeout
def connect_ex(self, addr):
errcode = 0
try:
self.connect(addr)
except Error, why:
errcode = why[0]
return errcode
# end connect_ex
def connect(self, addr, port=None, dumbhack=None):
# In case we were called as connect(host, port)
if port != None: addr = (addr, port)
# Shortcuts
sock = self._sock
timeout = self._timeout
blocking = self._blocking
# First, make a non-blocking call to connect
try:
sock.setblocking(0)
sock.connect(addr)
sock.setblocking(blocking)
return
except Error, why:
# Set the socket's blocking mode back
sock.setblocking(blocking)
# If we are not blocking, re-raise
if not blocking:
raise
# If we are already connected, then return success.
# If we got a genuine error, re-raise it.
errcode = why[0]
if dumbhack and errcode in _IsConnected:
return
elif errcode not in _ConnectBusy:
raise
# Now, wait for the connect to happen
# ONLY if dumbhack indicates this is pass number one.
# If select raises an error, we pass it on.
# Is this the right behavior?
if not dumbhack:
r,w,e = select.select([], [sock], [], timeout)
if w:
return self.connect(addr, dumbhack=1)
# If we get here, then we should raise Timeout
raise Timeout("Attempted connect to %s timed out." % str(addr) )
# end connect
def accept(self, dumbhack=None):
# Shortcuts
sock = self._sock
timeout = self._timeout
blocking = self._blocking
# First, make a non-blocking call to accept
# If we get a valid result, then convert the
# accept'ed socket into a TimeoutSocket.
# Be careful about our own blocking mode.
try:
sock.setblocking(0)
newsock, addr = sock.accept()
sock.setblocking(blocking)
timeoutnewsock = self.__class__(newsock, timeout)
timeoutnewsock.setblocking(blocking)
return (timeoutnewsock, addr)
except Error, why:
# Set the socket's blocking mode back
sock.setblocking(blocking)
# If we are not supposed to block, then re-raise
if not blocking:
raise
# If we got a genuine error, re-raise it.
errcode = why[0]
if errcode not in _AcceptBusy:
raise
# Now, wait for the accept to happen
# ONLY if dumbhack indicates this is pass number one.
# If select raises an error, we pass it on.
# Is this the right behavior?
if not dumbhack:
r,w,e = select.select([sock], [], [], timeout)
if r:
return self.accept(dumbhack=1)
# If we get here, then we should raise Timeout
raise Timeout("Attempted accept timed out.")
# end accept
def send(self, data, flags=0):
sock = self._sock
if self._blocking:
r,w,e = select.select([],[sock],[], self._timeout)
if not w:
raise Timeout("Send timed out")
return sock.send(data, flags)
# end send
def recv(self, bufsize, flags=0):
sock = self._sock
if self._blocking:
r,w,e = select.select([sock], [], [], self._timeout)
if not r:
raise Timeout("Recv timed out")
return sock.recv(bufsize, flags)
# end recv
def makefile(self, flags="r", bufsize=-1):
self._copies = self._copies +1
return TimeoutFile(self, flags, bufsize)
# end makefile
def close(self):
if self._copies <= 0:
self._sock.close()
else:
self._copies = self._copies -1
# end close
# end TimeoutSocket
class TimeoutFile:
"""TimeoutFile object
Implements a file-like object on top of TimeoutSocket.
"""
def __init__(self, sock, mode="r", bufsize=4096):
self._sock = sock
self._bufsize = 4096
if bufsize > 0: self._bufsize = bufsize
if not hasattr(sock, "_inqueue"): self._sock._inqueue = ""
# end __init__
def __getattr__(self, key):
return getattr(self._sock, key)
# end __getattr__
def close(self):
self._sock.close()
self._sock = None
# end close
def write(self, data):
self.send(data)
# end write
def read(self, size=-1):
_sock = self._sock
_bufsize = self._bufsize
while 1:
datalen = len(_sock._inqueue)
if datalen >= size >= 0:
break
bufsize = _bufsize
if size > 0:
bufsize = min(bufsize, size - datalen )
buf = self.recv(bufsize)
if not buf:
break
_sock._inqueue = _sock._inqueue + buf
data = _sock._inqueue
_sock._inqueue = ""
if size > 0 and datalen > size:
_sock._inqueue = data[size:]
data = data[:size]
return data
# end read
def readline(self, size=-1):
_sock = self._sock
_bufsize = self._bufsize
while 1:
idx = string.find(_sock._inqueue, "\n")
if idx >= 0:
break
datalen = len(_sock._inqueue)
if datalen >= size >= 0:
break
bufsize = _bufsize
if size > 0:
bufsize = min(bufsize, size - datalen )
buf = self.recv(bufsize)
if not buf:
break
_sock._inqueue = _sock._inqueue + buf
data = _sock._inqueue
_sock._inqueue = ""
if idx >= 0:
idx = idx + 1
_sock._inqueue = data[idx:]
data = data[:idx]
elif size > 0 and datalen > size:
_sock._inqueue = data[size:]
data = data[:size]
return data
# end readline
def readlines(self, sizehint=-1):
result = []
data = self.read()
while data:
idx = string.find(data, "\n")
if idx >= 0:
idx = idx + 1
result.append( data[:idx] )
data = data[idx:]
else:
result.append( data )
data = ""
return result
# end readlines
def flush(self): pass
# end TimeoutFile
#
# Silently replace the socket() builtin function with
# our timeoutsocket() definition.
#
if not hasattr(socket, "_no_timeoutsocket"):
socket._no_timeoutsocket = socket.socket
socket.socket = timeoutsocket
del socket
socket = timeoutsocket
# Finis
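if __name__ == '__main__':
    # Editor's smoke-test sketch (the host is illustrative): any socket
    # created after the shim is installed honours the default timeout.
    setDefaultSocketTimeout(5)
    s = timeoutsocket()
    try:
        s.connect(("www.example.com", 80))
        print "connected, timeout is", s.get_timeout()
    except Timeout:
        print "connect timed out"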

11
runtests.py Normal file
View File

@ -0,0 +1,11 @@
#!/usr/bin/env python
import glob, trace, unittest
# find all of the planet test modules
modules = map(trace.fullmodname, glob.glob('planet/tests/test_*.py'))
# load all of the tests into a suite
suite = unittest.TestLoader().loadTestsFromNames(modules)
# run test suite
unittest.TextTestRunner().run(suite)

22
setup.py Normal file
View File

@ -0,0 +1,22 @@
#!/usr/bin/env python
"""The Planet Feed Aggregator"""
import os
from distutils.core import setup
from planet import __version__ as VERSION
from planet import __license__ as LICENSE
if 'PLANET_VERSION' in os.environ.keys():
VERSION = os.environ['PLANET_VERSION']
setup(name="planet",
version=VERSION,
description="The Planet Feed Aggregator",
author="Planet Developers",
author_email="devel@lists.planetplanet.org",
url="http://www.planetplanet.org/",
license=LICENSE,
packages=["planet", "planet.compat_logging", "planet.tests"],
scripts=["planet.py", "planet-cache.py", "runtests.py"],
)

55
www/bloggers.css Normal file
View File

@ -0,0 +1,55 @@
#bloggers {
/* position: absolute; */
top: 115px;
right: 15px;
width: 230px;
}
#bloggers h2 {
margin-left: 0;
font-size: 12px;
}
#bloggers ul {
padding:0;
margin: 0 0 1.5em 0;
list-style-type:none;
}
#bloggers ul li {
padding: 1px;
}
#bloggers ul li div img {
}
#bloggers ul li div {
display: none;
}
#bloggers ul li:hover > a {
font-weight: bold;
}
#bloggers ul li div img.head {
float: right;
padding: 0px;
}
#bloggers ul li:hover > div {
display: inline;
}
#bloggers ul li:hover {
padding: 0 0 10px 0;
background-color: #cfcfcf;
}
#bloggers .ircnick {
display: block;
color: #000000;
font-style: italic;
padding: 2px;
}
#bloggers a:visited {
color: #5a7ac7 !important;
}

952
www/foafroll.xml Normal file
View File

@ -0,0 +1,952 @@
<?xml version="1.0"?>
<rdf:RDF
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
xmlns:foaf="http://xmlns.com/foaf/0.1/"
xmlns:rss="http://purl.org/rss/1.0/"
xmlns:dc="http://purl.org/dc/elements/1.1/"
>
<foaf:Group>
<foaf:name>Linux Gezegeni</foaf:name>
<foaf:homepage>http://gezegen.linux.org.tr</foaf:homepage>
<rdfs:seeAlso rdf:resource="" />
<foaf:member>
<foaf:Agent>
<foaf:name>A. Murat Eren</foaf:name>
<foaf:weblog>
<foaf:Document rdf:about="">
<dc:title></dc:title>
<rdfs:seeAlso>
<rss:channel rdf:about="" />
</rdfs:seeAlso>
</foaf:Document>
</foaf:weblog>
</foaf:Agent>
</foaf:member>
<foaf:member>
<foaf:Agent>
<foaf:name>Adem Alp Yıldız</foaf:name>
<foaf:weblog>
<foaf:Document rdf:about="http://www.ademalpyildiz.com.tr">
<dc:title>Adem Alp YILDIZ</dc:title>
<rdfs:seeAlso>
<rss:channel rdf:about="" />
</rdfs:seeAlso>
</foaf:Document>
</foaf:weblog>
</foaf:Agent>
</foaf:member>
<foaf:member>
<foaf:Agent>
<foaf:name>Ahmet Aygün</foaf:name>
<foaf:weblog>
<foaf:Document rdf:about="http://ahmet.pardusman.org/blog">
<dc:title>~/blog</dc:title>
<rdfs:seeAlso>
<rss:channel rdf:about="" />
</rdfs:seeAlso>
</foaf:Document>
</foaf:weblog>
</foaf:Agent>
</foaf:member>
<foaf:member>
<foaf:Agent>
<foaf:name>Ahmet Yıldız</foaf:name>
<foaf:weblog>
<foaf:Document rdf:about="http://www.bugunlinux.com">
<dc:title>Bugün Linux</dc:title>
<rdfs:seeAlso>
<rss:channel rdf:about="" />
</rdfs:seeAlso>
</foaf:Document>
</foaf:weblog>
</foaf:Agent>
</foaf:member>
<foaf:member>
<foaf:Agent>
<foaf:name>Ali Erdinç Köroğlu</foaf:name>
<foaf:weblog>
<foaf:Document rdf:about="http://www.erdinc.info">
<dc:title>The Point of no return » LKD</dc:title>
<rdfs:seeAlso>
<rss:channel rdf:about="" />
</rdfs:seeAlso>
</foaf:Document>
</foaf:weblog>
</foaf:Agent>
</foaf:member>
<foaf:member>
<foaf:Agent>
<foaf:name>Ali Erkan İMREK</foaf:name>
<foaf:weblog>
<foaf:Document rdf:about="http://armuting.blogspot.com/search/label/%C3%B6i_gezegen">
<dc:title>armut</dc:title>
<rdfs:seeAlso>
<rss:channel rdf:about="" />
</rdfs:seeAlso>
</foaf:Document>
</foaf:weblog>
</foaf:Agent>
</foaf:member>
<foaf:member>
<foaf:Agent>
<foaf:name>Alper Kanat</foaf:name>
<foaf:weblog>
<foaf:Document rdf:about="http://raptiye.org/blog">
<dc:title>raptiye</dc:title>
<rdfs:seeAlso>
<rss:channel rdf:about="" />
</rdfs:seeAlso>
</foaf:Document>
</foaf:weblog>
</foaf:Agent>
</foaf:member>
<foaf:member>
<foaf:Agent>
<foaf:name>Alper Orus</foaf:name>
<foaf:weblog>
<foaf:Document rdf:about="http://www.murekkep.org">
<dc:title>Mürekkep - İnternet Yaşam Rehberiniz » admin</dc:title>
<rdfs:seeAlso>
<rss:channel rdf:about="" />
</rdfs:seeAlso>
</foaf:Document>
</foaf:weblog>
</foaf:Agent>
</foaf:member>
<foaf:member>
<foaf:Agent>
<foaf:name>Alper Somuncu</foaf:name>
<foaf:weblog>
<foaf:Document rdf:about="http://www.alpersomuncu.com/weblog/">
<dc:title>alper somuncu nokta com - IBM AIX</dc:title>
<rdfs:seeAlso>
<rss:channel rdf:about="" />
</rdfs:seeAlso>
</foaf:Document>
</foaf:weblog>
</foaf:Agent>
</foaf:member>
<foaf:member>
<foaf:Agent>
<foaf:name>Arman Aksoy</foaf:name>
<foaf:weblog>
<foaf:Document rdf:about="http://armish.linux-sevenler.org/blog">
<dc:title>Expressed Exons » Gezegen</dc:title>
<rdfs:seeAlso>
<rss:channel rdf:about="" />
</rdfs:seeAlso>
</foaf:Document>
</foaf:weblog>
</foaf:Agent>
</foaf:member>
<foaf:member>
<foaf:Agent>
<foaf:name>Bahri Meriç Canlı</foaf:name>
<foaf:weblog>
<foaf:Document rdf:about="http://www.bahri.info">
<dc:title>Bahri Meriç CANLI Kişisel Web Sitesi » Linux</dc:title>
<rdfs:seeAlso>
<rss:channel rdf:about="" />
</rdfs:seeAlso>
</foaf:Document>
</foaf:weblog>
</foaf:Agent>
</foaf:member>
<foaf:member>
<foaf:Agent>
<foaf:name>Barış Metin</foaf:name>
<foaf:weblog>
<foaf:Document rdf:about="">
<dc:title></dc:title>
<rdfs:seeAlso>
<rss:channel rdf:about="" />
</rdfs:seeAlso>
</foaf:Document>
</foaf:weblog>
</foaf:Agent>
</foaf:member>
<foaf:member>
<foaf:Agent>
<foaf:name>Barış Özyurt</foaf:name>
<foaf:weblog>
<foaf:Document rdf:about="http://www.tuxworkshop.com/blog">
<dc:title>TuxWorkshop</dc:title>
<rdfs:seeAlso>
<rss:channel rdf:about="" />
</rdfs:seeAlso>
</foaf:Document>
</foaf:weblog>
</foaf:Agent>
</foaf:member>
<foaf:member>
<foaf:Agent>
<foaf:name>Bora Güngören</foaf:name>
<foaf:weblog>
<foaf:Document rdf:about="http://blogs.portakalteknoloji.com/bora/blog/">
<dc:title>Bora Güngören</dc:title>
<rdfs:seeAlso>
<rss:channel rdf:about="" />
</rdfs:seeAlso>
</foaf:Document>
</foaf:weblog>
</foaf:Agent>
</foaf:member>
<foaf:member>
<foaf:Agent>
<foaf:name>Can Burak Çilingir</foaf:name>
<foaf:weblog>
<foaf:Document rdf:about="http://canburak.wordpress.com">
<dc:title>Can Burak Çilingir » gezegen-linux</dc:title>
<rdfs:seeAlso>
<rss:channel rdf:about="" />
</rdfs:seeAlso>
</foaf:Document>
</foaf:weblog>
</foaf:Agent>
</foaf:member>
<foaf:member>
<foaf:Agent>
<foaf:name>Can Kavaklıoğlu</foaf:name>
<foaf:weblog>
<foaf:Document rdf:about="http://www.cankavaklioglu.name.tr/guncelgunce/archives/linux/index.html">
<dc:title>Güncel günce</dc:title>
<rdfs:seeAlso>
<rss:channel rdf:about="" />
</rdfs:seeAlso>
</foaf:Document>
</foaf:weblog>
</foaf:Agent>
</foaf:member>
<foaf:member>
<foaf:Agent>
<foaf:name>Deniz Koçak</foaf:name>
<foaf:weblog>
<foaf:Document rdf:about="http://flyeater.wordpress.com">
<dc:title>King of Kebab » lkd</dc:title>
<rdfs:seeAlso>
<rss:channel rdf:about="" />
</rdfs:seeAlso>
</foaf:Document>
</foaf:weblog>
</foaf:Agent>
</foaf:member>
<foaf:member>
<foaf:Agent>
<foaf:name>Devrim Gündüz</foaf:name>
<foaf:weblog>
<foaf:Document rdf:about="">
<dc:title></dc:title>
<rdfs:seeAlso>
<rss:channel rdf:about="" />
</rdfs:seeAlso>
</foaf:Document>
</foaf:weblog>
</foaf:Agent>
</foaf:member>
<foaf:member>
<foaf:Agent>
<foaf:name>Doruk Fişek</foaf:name>
<foaf:weblog>
<foaf:Document rdf:about="http://zzz.fisek.com.tr/seyir-defteri">
<dc:title>Sit Alanı'nın Seyir Defteri » Gezegen</dc:title>
<rdfs:seeAlso>
<rss:channel rdf:about="" />
</rdfs:seeAlso>
</foaf:Document>
</foaf:weblog>
</foaf:Agent>
</foaf:member>
<foaf:member>
<foaf:Agent>
<foaf:name>Ekin Meroğlu</foaf:name>
<foaf:weblog>
<foaf:Document rdf:about="http://ekin.fisek.com.tr/blog">
<dc:title>Sütlü Kahve</dc:title>
<rdfs:seeAlso>
<rss:channel rdf:about="" />
</rdfs:seeAlso>
</foaf:Document>
</foaf:weblog>
</foaf:Agent>
</foaf:member>
<foaf:member>
<foaf:Agent>
<foaf:name>Enver Altın</foaf:name>
<foaf:weblog>
<foaf:Document rdf:about="http://enveraltin.com/blog">
<dc:title>The truth about my life</dc:title>
<rdfs:seeAlso>
<rss:channel rdf:about="" />
</rdfs:seeAlso>
</foaf:Document>
</foaf:weblog>
</foaf:Agent>
</foaf:member>
<foaf:member>
<foaf:Agent>
<foaf:name>Erhan Ekici</foaf:name>
<foaf:weblog>
<foaf:Document rdf:about="http://www.erhanekici.com/blog">
<dc:title>bir delinin hatıra defteri » linux</dc:title>
<rdfs:seeAlso>
<rss:channel rdf:about="" />
</rdfs:seeAlso>
</foaf:Document>
</foaf:weblog>
</foaf:Agent>
</foaf:member>
<foaf:member>
<foaf:Agent>
<foaf:name>Erol Soyöz</foaf:name>
<foaf:weblog>
<foaf:Document rdf:about="http://www.soyoz.com/gunce">
<dc:title>Erol Soyöz | Dağıtık günce » linux gezegeni</dc:title>
<rdfs:seeAlso>
<rss:channel rdf:about="" />
</rdfs:seeAlso>
</foaf:Document>
</foaf:weblog>
</foaf:Agent>
</foaf:member>
<foaf:member>
<foaf:Agent>
<foaf:name>Erçin Eker</foaf:name>
<foaf:weblog>
<foaf:Document rdf:about="http://www.eker.info/blog">
<dc:title>The Useless Journal v4</dc:title>
<rdfs:seeAlso>
<rss:channel rdf:about="" />
</rdfs:seeAlso>
</foaf:Document>
</foaf:weblog>
</foaf:Agent>
</foaf:member>
<foaf:member>
<foaf:Agent>
<foaf:name>FTP ekibi</foaf:name>
<foaf:weblog>
<foaf:Document rdf:about="http://gunluk.lkd.org.tr/ftp">
<dc:title>LKD FTP Ekibi</dc:title>
<rdfs:seeAlso>
<rss:channel rdf:about="" />
</rdfs:seeAlso>
</foaf:Document>
</foaf:weblog>
</foaf:Agent>
</foaf:member>
<foaf:member>
<foaf:Agent>
<foaf:name>Faik Uygur</foaf:name>
<foaf:weblog>
<foaf:Document rdf:about="http://www.faikuygur.com/blog">
<dc:title>Bir Takım Şeyler</dc:title>
<rdfs:seeAlso>
<rss:channel rdf:about="" />
</rdfs:seeAlso>
</foaf:Document>
</foaf:weblog>
</foaf:Agent>
</foaf:member>
<foaf:member>
<foaf:Agent>
<foaf:name>Fatih Arslan</foaf:name>
<foaf:weblog>
<foaf:Document rdf:about="http://blog.arsln.org">
<dc:title>Arslanlar Şehri » Gezegen</dc:title>
<rdfs:seeAlso>
<rss:channel rdf:about="" />
</rdfs:seeAlso>
</foaf:Document>
</foaf:weblog>
</foaf:Agent>
</foaf:member>
<foaf:member>
<foaf:Agent>
<foaf:name>Furkan Çalışkan</foaf:name>
<foaf:weblog>
<foaf:Document rdf:about="http://www.furkancaliskan.com/blog">
<dc:title>ozirus' » Gezegen</dc:title>
<rdfs:seeAlso>
<rss:channel rdf:about="" />
</rdfs:seeAlso>
</foaf:Document>
</foaf:weblog>
</foaf:Agent>
</foaf:member>
<foaf:member>
<foaf:Agent>
<foaf:name>Gökmen Göksel</foaf:name>
<foaf:weblog>
<foaf:Document rdf:about="">
<dc:title></dc:title>
<rdfs:seeAlso>
<rss:channel rdf:about="" />
</rdfs:seeAlso>
</foaf:Document>
</foaf:weblog>
</foaf:Agent>
</foaf:member>
<foaf:member>
<foaf:Agent>
<foaf:name>Gürcan Öztürk</foaf:name>
<foaf:weblog>
<foaf:Document rdf:about="http://gurcanozturk.com">
<dc:title>gurcanozturk.com</dc:title>
<rdfs:seeAlso>
<rss:channel rdf:about="" />
</rdfs:seeAlso>
</foaf:Document>
</foaf:weblog>
</foaf:Agent>
</foaf:member>
<foaf:member>
<foaf:Agent>
<foaf:name>Gürer Özen</foaf:name>
<foaf:weblog>
<foaf:Document rdf:about="http://6kere9.com/blag/">
<dc:title>Indiana Jones' Diary</dc:title>
<rdfs:seeAlso>
<rss:channel rdf:about="" />
</rdfs:seeAlso>
</foaf:Document>
</foaf:weblog>
</foaf:Agent>
</foaf:member>
<foaf:member>
<foaf:Agent>
<foaf:name>Hakan Uygun</foaf:name>
<foaf:weblog>
<foaf:Document rdf:about="http://www.hakanuygun.com/blog">
<dc:title>hakan.uygun.yazıyor.* » Gezegen</dc:title>
<rdfs:seeAlso>
<rss:channel rdf:about="" />
</rdfs:seeAlso>
</foaf:Document>
</foaf:weblog>
</foaf:Agent>
</foaf:member>
<foaf:member>
<foaf:Agent>
<foaf:name>Hüseyin Uslu</foaf:name>
<foaf:weblog>
<foaf:Document rdf:about="http://www.huseyinuslu.net/_export/xhtml/topics_linux_feed">
<dc:title>Regular (S)expressions » linux</dc:title>
<rdfs:seeAlso>
<rss:channel rdf:about="" />
</rdfs:seeAlso>
</foaf:Document>
</foaf:weblog>
</foaf:Agent>
</foaf:member>
<foaf:member>
<foaf:Agent>
<foaf:name>K. Deniz Öğüt</foaf:name>
<foaf:weblog>
<foaf:Document rdf:about="http://marenostrum.blogsome.com">
<dc:title>Mare Nostrum</dc:title>
<rdfs:seeAlso>
<rss:channel rdf:about="" />
</rdfs:seeAlso>
</foaf:Document>
</foaf:weblog>
</foaf:Agent>
</foaf:member>
<foaf:member>
<foaf:Agent>
<foaf:name>Kaya Oğuz</foaf:name>
<foaf:weblog>
<foaf:Document rdf:about="">
<dc:title></dc:title>
<rdfs:seeAlso>
<rss:channel rdf:about="" />
</rdfs:seeAlso>
</foaf:Document>
</foaf:weblog>
</foaf:Agent>
</foaf:member>
<foaf:member>
<foaf:Agent>
<foaf:name>Kerem Can Karakaş</foaf:name>
<foaf:weblog>
<foaf:Document rdf:about="http://www.blockdiagram.net/blog">
<dc:title>Blog</dc:title>
<rdfs:seeAlso>
<rss:channel rdf:about="" />
</rdfs:seeAlso>
</foaf:Document>
</foaf:weblog>
</foaf:Agent>
</foaf:member>
<foaf:member>
<foaf:Agent>
<foaf:name>Koray Bostancı</foaf:name>
<foaf:weblog>
<foaf:Document rdf:about="http://www.koray.org/blog">
<dc:title>olmayana ergi..</dc:title>
<rdfs:seeAlso>
<rss:channel rdf:about="" />
</rdfs:seeAlso>
</foaf:Document>
</foaf:weblog>
</foaf:Agent>
</foaf:member>
<foaf:member>
<foaf:Agent>
<foaf:name>Kubilay Onur Güngör</foaf:name>
<foaf:weblog>
<foaf:Document rdf:about="http://www.kirmizivesiyah.org">
<dc:title>Kırmızı ve Siyah » Gezegen</dc:title>
<rdfs:seeAlso>
<rss:channel rdf:about="" />
</rdfs:seeAlso>
</foaf:Document>
</foaf:weblog>
</foaf:Agent>
</foaf:member>
<foaf:member>
<foaf:Agent>
<foaf:name>LKD Seminer Duyuruları</foaf:name>
<foaf:weblog>
<foaf:Document rdf:about="">
<dc:title></dc:title>
<rdfs:seeAlso>
<rss:channel rdf:about="" />
</rdfs:seeAlso>
</foaf:Document>
</foaf:weblog>
</foaf:Agent>
</foaf:member>
<foaf:member>
<foaf:Agent>
<foaf:name>LKD YK</foaf:name>
<foaf:weblog>
<foaf:Document rdf:about="http://gunluk.lkd.org.tr/yk">
<dc:title>Linux Kullanıcıları Derneği Yönetim Kurulu » Günlük</dc:title>
<rdfs:seeAlso>
<rss:channel rdf:about="" />
</rdfs:seeAlso>
</foaf:Document>
</foaf:weblog>
</foaf:Agent>
</foaf:member>
<foaf:member>
<foaf:Agent>
<foaf:name>LKD.org.tr</foaf:name>
<foaf:weblog>
<foaf:Document rdf:about="http://www.lkd.org.tr">
<dc:title>Haberler</dc:title>
<rdfs:seeAlso>
<rss:channel rdf:about="" />
</rdfs:seeAlso>
</foaf:Document>
</foaf:weblog>
</foaf:Agent>
</foaf:member>
<foaf:member>
<foaf:Agent>
<foaf:name>Levent Yalçın</foaf:name>
<foaf:weblog>
<foaf:Document rdf:about="http://leoman.gen.tr/blg">
<dc:title>Leoman® » LKD-Gezegen</dc:title>
<rdfs:seeAlso>
<rss:channel rdf:about="" />
</rdfs:seeAlso>
</foaf:Document>
</foaf:weblog>
</foaf:Agent>
</foaf:member>
<foaf:member>
<foaf:Agent>
<foaf:name>M.Murat Akbaş</foaf:name>
<foaf:weblog>
<foaf:Document rdf:about="http://mmakbas.wordpress.com">
<dc:title>Mehmet Murat AKBAS » Gezegen</dc:title>
<rdfs:seeAlso>
<rss:channel rdf:about="" />
</rdfs:seeAlso>
</foaf:Document>
</foaf:weblog>
</foaf:Agent>
</foaf:member>
<foaf:member>
<foaf:Agent>
<foaf:name>M.Tuğrul Yılmazer</foaf:name>
<foaf:weblog>
<foaf:Document rdf:about="">
<dc:title></dc:title>
<rdfs:seeAlso>
<rss:channel rdf:about="" />
</rdfs:seeAlso>
</foaf:Document>
</foaf:weblog>
</foaf:Agent>
</foaf:member>
<foaf:member>
<foaf:Agent>
<foaf:name>Mehmet Büyüközer</foaf:name>
<foaf:weblog>
<foaf:Document rdf:about="http://www.sonofnights.com">
<dc:title>Mehmet Büyüközer</dc:title>
<rdfs:seeAlso>
<rss:channel rdf:about="" />
</rdfs:seeAlso>
</foaf:Document>
</foaf:weblog>
</foaf:Agent>
</foaf:member>
<foaf:member>
<foaf:Agent>
<foaf:name>Murat Hazer</foaf:name>
<foaf:weblog>
<foaf:Document rdf:about="http://mhazer.blogspot.com/search/label/gezegen">
<dc:title>Murat HAZER</dc:title>
<rdfs:seeAlso>
<rss:channel rdf:about="" />
</rdfs:seeAlso>
</foaf:Document>
</foaf:weblog>
</foaf:Agent>
</foaf:member>
<foaf:member>
<foaf:Agent>
<foaf:name>Murat Sağlam</foaf:name>
<foaf:weblog>
<foaf:Document rdf:about="http://panhaema.com">
<dc:title>panhaema.com</dc:title>
<rdfs:seeAlso>
<rss:channel rdf:about="" />
</rdfs:seeAlso>
</foaf:Document>
</foaf:weblog>
</foaf:Agent>
</foaf:member>
<foaf:member>
<foaf:Agent>
<foaf:name>Mustafa Karakaplan</foaf:name>
<foaf:weblog>
<foaf:Document rdf:about="http://web.inonu.edu.tr/~mkarakaplan/blog">
<dc:title>MuKa PlaNeT</dc:title>
<rdfs:seeAlso>
<rss:channel rdf:about="" />
</rdfs:seeAlso>
</foaf:Document>
</foaf:weblog>
</foaf:Agent>
</foaf:member>
<foaf:member>
<foaf:Agent>
<foaf:name>Necati Demir</foaf:name>
<foaf:weblog>
<foaf:Document rdf:about="http://blog.demir.web.tr/">
<dc:title>:(){ :|:&amp; };:</dc:title>
<rdfs:seeAlso>
<rss:channel rdf:about="" />
</rdfs:seeAlso>
</foaf:Document>
</foaf:weblog>
</foaf:Agent>
</foaf:member>
<foaf:member>
<foaf:Agent>
<foaf:name>Necdet Yücel</foaf:name>
<foaf:weblog>
<foaf:Document rdf:about="http://nyucel.blogspot.com/search/label/gezegen">
<dc:title>nyucel's diary</dc:title>
<rdfs:seeAlso>
<rss:channel rdf:about="" />
</rdfs:seeAlso>
</foaf:Document>
</foaf:weblog>
</foaf:Agent>
</foaf:member>
<foaf:member>
<foaf:Agent>
<foaf:name>Nesimi Acarca</foaf:name>
<foaf:weblog>
<foaf:Document rdf:about="http://www.nesimia.com">
<dc:title>nesimia.com</dc:title>
<rdfs:seeAlso>
<rss:channel rdf:about="" />
</rdfs:seeAlso>
</foaf:Document>
</foaf:weblog>
</foaf:Agent>
</foaf:member>
<foaf:member>
<foaf:Agent>
<foaf:name>Onur Tolga Şehitoğlu</foaf:name>
<foaf:weblog>
<foaf:Document rdf:about="http://sehitoglu.web.tr/gunluk">
<dc:title>Onur'sal » Bilgisayar</dc:title>
<rdfs:seeAlso>
<rss:channel rdf:about="" />
</rdfs:seeAlso>
</foaf:Document>
</foaf:weblog>
</foaf:Agent>
</foaf:member>
<foaf:member>
<foaf:Agent>
<foaf:name>Onur Yalazı</foaf:name>
<foaf:weblog>
<foaf:Document rdf:about="http://www.yalazi.org">
<dc:title>www.yalazi.org</dc:title>
<rdfs:seeAlso>
<rss:channel rdf:about="" />
</rdfs:seeAlso>
</foaf:Document>
</foaf:weblog>
</foaf:Agent>
</foaf:member>
<foaf:member>
<foaf:Agent>
<foaf:name>Oğuz Yarımtepe</foaf:name>
<foaf:weblog>
<foaf:Document rdf:about="http://www.loopbacking.info/blog">
<dc:title>import me » Gezegen</dc:title>
<rdfs:seeAlso>
<rss:channel rdf:about="" />
</rdfs:seeAlso>
</foaf:Document>
</foaf:weblog>
</foaf:Agent>
</foaf:member>
<foaf:member>
<foaf:Agent>
<foaf:name>Penguen-CG</foaf:name>
<foaf:weblog>
<foaf:Document rdf:about="">
<dc:title></dc:title>
<rdfs:seeAlso>
<rss:channel rdf:about="" />
</rdfs:seeAlso>
</foaf:Document>
</foaf:weblog>
</foaf:Agent>
</foaf:member>
<foaf:member>
<foaf:Agent>
<foaf:name>Python-TR</foaf:name>
<foaf:weblog>
<foaf:Document rdf:about="http://www.python-tr.com">
<dc:title>Python - Java</dc:title>
<rdfs:seeAlso>
<rss:channel rdf:about="" />
</rdfs:seeAlso>
</foaf:Document>
</foaf:weblog>
</foaf:Agent>
</foaf:member>
<foaf:member>
<foaf:Agent>
<foaf:name>Pınar Yanardağ</foaf:name>
<foaf:weblog>
<foaf:Document rdf:about="http://pinguar.org/gunluk">
<dc:title>..the mythical woman month..</dc:title>
<rdfs:seeAlso>
<rss:channel rdf:about="" />
</rdfs:seeAlso>
</foaf:Document>
</foaf:weblog>
</foaf:Agent>
</foaf:member>
<foaf:member>
<foaf:Agent>
<foaf:name>Recai Oktaş</foaf:name>
<foaf:weblog>
<foaf:Document rdf:about="">
<dc:title></dc:title>
<rdfs:seeAlso>
<rss:channel rdf:about="" />
</rdfs:seeAlso>
</foaf:Document>
</foaf:weblog>
</foaf:Agent>
</foaf:member>
<foaf:member>
<foaf:Agent>
<foaf:name>Serbülent Ünsal</foaf:name>
<foaf:weblog>
<foaf:Document rdf:about="http://nightwalkers.blogspot.com/">
<dc:title>Serbülent Ünsal'ın Web Günlüğü</dc:title>
<rdfs:seeAlso>
<rss:channel rdf:about="" />
</rdfs:seeAlso>
</foaf:Document>
</foaf:weblog>
</foaf:Agent>
</foaf:member>
<foaf:member>
<foaf:Agent>
<foaf:name>Serkan Altuntaş</foaf:name>
<foaf:weblog>
<foaf:Document rdf:about="http://serkan.gen.tr">
<dc:title>serkan » Linux Gezegeni</dc:title>
<rdfs:seeAlso>
<rss:channel rdf:about="" />
</rdfs:seeAlso>
</foaf:Document>
</foaf:weblog>
</foaf:Agent>
</foaf:member>
<foaf:member>
<foaf:Agent>
<foaf:name>Serkan Kaba</foaf:name>
<foaf:weblog>
<foaf:Document rdf:about="http://serkank.wordpress.com">
<dc:title>Serkan Kaba</dc:title>
<rdfs:seeAlso>
<rss:channel rdf:about="" />
</rdfs:seeAlso>
</foaf:Document>
</foaf:weblog>
</foaf:Agent>
</foaf:member>
<foaf:member>
<foaf:Agent>
<foaf:name>Serkan Kenar</foaf:name>
<foaf:weblog>
<foaf:Document rdf:about="http://serkan.feyvi.org/blog">
<dc:title>Kayıp Şehir / Serkan Kenar » debian</dc:title>
<rdfs:seeAlso>
<rss:channel rdf:about="" />
</rdfs:seeAlso>
</foaf:Document>
</foaf:weblog>
</foaf:Agent>
</foaf:member>
<foaf:member>
<foaf:Agent>
<foaf:name>Server Acim</foaf:name>
<foaf:weblog>
<foaf:Document rdf:about="http://www.serveracim.net/serendipity/">
<dc:title>Pardus, Müzik, Yaşam...</dc:title>
<rdfs:seeAlso>
<rss:channel rdf:about="" />
</rdfs:seeAlso>
</foaf:Document>
</foaf:weblog>
</foaf:Agent>
</foaf:member>
<foaf:member>
<foaf:Agent>
<foaf:name>Sinan Alyürük</foaf:name>
<foaf:weblog>
<foaf:Document rdf:about="http://www.ayder.org/gunluk">
<dc:title>Ayder Zamanı</dc:title>
<rdfs:seeAlso>
<rss:channel rdf:about="" />
</rdfs:seeAlso>
</foaf:Document>
</foaf:weblog>
</foaf:Agent>
</foaf:member>
<foaf:member>
<foaf:Agent>
<foaf:name>Stand</foaf:name>
<foaf:weblog>
<foaf:Document rdf:about="">
<dc:title></dc:title>
<rdfs:seeAlso>
<rss:channel rdf:about="" />
</rdfs:seeAlso>
</foaf:Document>
</foaf:weblog>
</foaf:Agent>
</foaf:member>
<foaf:member>
<foaf:Agent>
<foaf:name>Talat Uyarer</foaf:name>
<foaf:weblog>
<foaf:Document rdf:about="http://talat.uyarer.com">
<dc:title>Huzur Mekanı</dc:title>
<rdfs:seeAlso>
<rss:channel rdf:about="" />
</rdfs:seeAlso>
</foaf:Document>
</foaf:weblog>
</foaf:Agent>
</foaf:member>
<foaf:member>
<foaf:Agent>
<foaf:name>Tonguç Yumruk</foaf:name>
<foaf:weblog>
<foaf:Document rdf:about="http://tonguc.name/blog">
<dc:title>Tonguç Yumruk'un Weblog'u</dc:title>
<rdfs:seeAlso>
<rss:channel rdf:about="" />
</rdfs:seeAlso>
</foaf:Document>
</foaf:weblog>
</foaf:Agent>
</foaf:member>
<foaf:member>
<foaf:Agent>
<foaf:name>Umur Erdinç</foaf:name>
<foaf:weblog>
<foaf:Document rdf:about="http://eumur.wordpress.com">
<dc:title>Umur'un Güncesi</dc:title>
<rdfs:seeAlso>
<rss:channel rdf:about="" />
</rdfs:seeAlso>
</foaf:Document>
</foaf:weblog>
</foaf:Agent>
</foaf:member>
<foaf:member>
<foaf:Agent>
<foaf:name>Web-CG</foaf:name>
<foaf:weblog>
<foaf:Document rdf:about="http://gunluk.lkd.org.tr/webcg">
<dc:title>Web Çalışma Grubu</dc:title>
<rdfs:seeAlso>
<rss:channel rdf:about="" />
</rdfs:seeAlso>
</foaf:Document>
</foaf:weblog>
</foaf:Agent>
</foaf:member>
<foaf:member>
<foaf:Agent>
<foaf:name>Ömer Fadıl Usta</foaf:name>
<foaf:weblog>
<foaf:Document rdf:about="http://bilisimlab.com/blog/index.php">
<dc:title>Bi'Log</dc:title>
<rdfs:seeAlso>
<rss:channel rdf:about="" />
</rdfs:seeAlso>
</foaf:Document>
</foaf:weblog>
</foaf:Agent>
</foaf:member>
<foaf:member>
<foaf:Agent>
<foaf:name>Özgürlükiçin.com</foaf:name>
<foaf:weblog>
<foaf:Document rdf:about="http://www.ozgurlukicin.com">
<dc:title>Özgürlük için... - Haberler</dc:title>
<rdfs:seeAlso>
<rss:channel rdf:about="" />
</rdfs:seeAlso>
</foaf:Document>
</foaf:weblog>
</foaf:Agent>
</foaf:member>
<foaf:member>
<foaf:Agent>
<foaf:name>Ümran Kamar</foaf:name>
<foaf:weblog>
<foaf:Document rdf:about="http://handlet.blogspot.com/">
<dc:title>Morning Glory</dc:title>
<rdfs:seeAlso>
<rss:channel rdf:about="" />
</rdfs:seeAlso>
</foaf:Document>
</foaf:weblog>
</foaf:Agent>
</foaf:member>
</foaf:Group>
</rdf:RDF>
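A note on the FOAF roll that ends above: each <foaf:member> wraps a <foaf:Agent> carrying the member's name and a <foaf:Document> whose rdf:about attribute holds the weblog URL (left empty when no weblog is registered, as for Penguen-CG, Recai Oktaş and Stand). As a minimal sketch, such a roll can be read with Python's standard library alone; the filename "foafroll.xml" and the output format are assumptions for illustration, while the namespace URIs are the standard FOAF and RDF ones the roll uses:

    import xml.etree.ElementTree as ET

    # Standard namespace URIs; the FOAF roll above relies on the same ones.
    FOAF = "http://xmlns.com/foaf/0.1/"
    RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    NS = {"foaf": FOAF, "rdf": RDF}

    tree = ET.parse("foafroll.xml")  # hypothetical filename, not from this commit
    for agent in tree.getroot().iter("{%s}Agent" % FOAF):
        name = agent.findtext("foaf:name", default="", namespaces=NS)
        doc = agent.find("foaf:weblog/foaf:Document", NS)
        url = doc.get("{%s}about" % RDF, "") if doc is not None else ""
        print("%s -> %s" % (name, url or "(no weblog)"))

Run against the roll above, this prints one "name -> weblog URL" line per member, which is a quick way to sanity-check the list against the channels defined in config.ini.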

52
www/generic.css Normal file
View File

@ -0,0 +1,52 @@
/* Basic tags */
a img {
border: 0px;
}
pre {
overflow: auto;
}
/* Anchors */
a {
color: #333638;
}
a:visited {
color: #757B7F;
}
a:active {
color: #ff0000;
}
/* Basic classes */
.none { /* to add paragraph spacing to various elements for ttys */
margin: 0px;
padding: 0px;
}
.invisible { /* stuff that should appear when this css isn't used */
margin: 0px;
border: 0px;
padding: 0px;
height: 0px;
visibility: hidden;
}
.left {
margin: 10px;
padding: 0px;
float: left;
}
.right {
margin: 10px;
padding: 0px;
float: right;
}
.center {
text-align: center;
}

BIN  www/images/bulusuyoruz.gif Normal file (binary image, 30 KiB, not shown)

BIN  www/images/delicious.png Normal file (binary image, 208 B, not shown)

BIN  www/images/hdr-planet.png Normal file (binary image, 21 KiB, not shown)

View File

@ -0,0 +1,8 @@
<html>
<head>
<title></title>
<meta content="">
<style></style>
</head>
<body></body>
</html>

BIN  www/images/heads/meren.png Normal file (binary image, 10 KiB, not shown)

BIN  www/images/heads/nobody.png Normal file (binary image, 3.2 KiB, not shown)

BIN  www/images/heads/senlik.png Normal file (binary image, 6.7 KiB, not shown)

BIN  www/images/logo.png Normal file (binary image, 5.0 KiB, not shown)

Some files were not shown because too many files have changed in this diff.