Methods & Tools

Google Open Source Blog
News about Google's Open Source projects and programs.

Open-sourcing DeepMind Lab

Wed, 12/07/2016 - 18:00
Originally posted on DeepMind Blog

DeepMind's scientific mission is to push the boundaries of AI, developing systems that can learn to solve any complex problem without needing to be taught how. To achieve this, we work from the premise that AI needs to be general. Agents should operate across a wide range of tasks and be able to automatically adapt to changing circumstances. That is, they should not be pre-programmed, but rather, able to learn automatically from their raw inputs and reward signals from the environment. There are two parts to this research program: (1) designing ever more intelligent agents capable of more and more sophisticated cognitive skills, and (2) building increasingly complex environments where agents can be trained and evaluated.

The development of innovative agents goes hand in hand with the careful design and implementation of rationally selected, flexible and well-maintained environments. To that end, we at DeepMind have invested considerable effort toward building rich simulated environments to serve as “laboratories” for AI research. Now we are open-sourcing our flagship platform, DeepMind Lab, so the broader research community can make use of it.

DeepMind Lab is a fully 3D game-like platform tailored for agent-based AI research. It is observed from a first-person viewpoint, through the eyes of the simulated agent. Scenes are rendered with rich science fiction-style visuals. The available actions allow agents to look around and move in 3D. The agent’s “body” is a floating orb. It levitates and moves by activating thrusters opposite its desired direction of movement, and it has a camera that moves around the main sphere as a ball-in-socket joint tracking the rotational look actions. Example tasks include collecting fruit, navigating in mazes, traversing dangerous passages while avoiding falling off cliffs, bouncing through space using launch pads to move between platforms, playing laser tag, and quickly learning and remembering random procedurally generated environments. An illustration of how agents in DeepMind Lab perceive and interact with the world can be seen below:

At each moment in time, agents observe the world as an image, in pixels, rendered from their own first-person perspective. They also may receive a reward (or punishment!) signal. The agent can activate its thrusters to move in 3D and can also rotate its viewpoint along both horizontal and vertical axes.
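The observe, act, reward cycle described above can be sketched in a few lines. This is a toy illustration only: the `Corridor` environment, its one-row "image" and the random agent are hypothetical stand-ins invented for this example, not part of DeepMind Lab or its API.

```java
import java.util.Random;

/** Toy agent-environment loop; every name here is hypothetical, not a DeepMind Lab API. */
final class AgentLoopSketch {

    /** Stand-in environment: a corridor of cells with a reward at the far end. */
    static final class Corridor {
        private final int length;
        private int position = 0;

        Corridor(int length) {
            if (length < 1) {
                throw new IllegalArgumentException("length must be at least 1");
            }
            this.length = length;
        }

        /** Observation: a one-row "image" with pixel 255 at the agent's cell, 0 elsewhere. */
        int[] observe() {
            int[] pixels = new int[length];
            pixels[position] = 255;
            return pixels;
        }

        /** Applies an action (-1 = left, +1 = right) and returns the reward for this step. */
        double step(int action) {
            position = Math.max(0, Math.min(length - 1, position + action));
            return position == length - 1 ? 1.0 : 0.0;
        }

        boolean done() {
            return position == length - 1;
        }
    }

    /** Runs a random agent for at most maxSteps and returns the total reward collected. */
    static double runEpisode(Corridor env, long seed, int maxSteps) {
        Random random = new Random(seed);
        double total = 0.0;
        for (int t = 0; t < maxSteps && !env.done(); t++) {
            int[] pixels = env.observe();               // the agent sees only pixels
            int action = random.nextBoolean() ? 1 : -1; // a learning agent would decide from pixels
            total += env.step(action);                  // reward signal from the environment
        }
        return total;
    }

    public static void main(String[] args) {
        double reward = runEpisode(new Corridor(5), 1L, 1_000);
        System.out.println("total reward: " + reward);
    }
}
```

A learning agent would replace the random choice with a policy that maps the observed pixels to an action and is updated from the rewards it receives.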

Artificial general intelligence research in DeepMind Lab emphasizes navigation, memory, 3D vision from a first person viewpoint, motor control, planning, strategy, time, and fully autonomous agents that must learn for themselves what tasks to perform by exploring their environment. All these factors make learning difficult. Each is considered a frontier research question in its own right. Putting them all together in one platform, as we have, represents a significant new challenge for the field.


DeepMind Lab is highly customisable and extendable. New levels can be authored with off-the-shelf editor tools. In addition, DeepMind Lab includes an interface for programmatic level-creation. Levels can be customised with gameplay logic, item pickups, custom observations, level restarts, reward schemes, in-game messages and more. The interface can be used to create levels in which novel map layouts are generated on the fly while an agent trains. These features are useful in, for example, testing how an agent copes with unfamiliar environments. Users will be able to add custom levels to the platform via GitHub. The assets will be hosted on GitHub alongside all the code, maps and level scripts. Our hope is that the community will help us shape and develop the platform going forward.



DeepMind Lab has been used internally at DeepMind for some time (example). We believe it has already had a significant impact on our thinking concerning numerous aspects of intelligence, both natural and artificial. However, our efforts so far have only barely scratched the surface of what is possible in DeepMind Lab. There are opportunities for significant contributions still to be made in a number of mostly still untouched research domains now available through DeepMind Lab, such as navigation, memory and exploration.

As well as facilitating agent evaluation, there are compelling reasons to think that it may be fundamentally easier to develop intelligence in a 3D world, observed from a first-person viewpoint, like DeepMind Lab. After all, the only known examples of general-purpose intelligence in the natural world arose from a combination of evolution, development, and learning, grounded in physics and the sensory apparatus of animals. It is possible that a large fraction of animal and human intelligence is a direct consequence of the richness of our environment, and unlikely to arise without it. Consider the alternative: if you or I had grown up in a world that looked like Space Invaders or Pac-Man, it doesn’t seem likely we would have achieved much general intelligence!

Read the full paper here.

Access DeepMind's GitHub repository here.

By Charlie Beattie, Joel Leibo, Stig Petersen and Shane Legg, DeepMind Team


Categories: Open Source

Why I contribute to Chromium

Mon, 12/05/2016 - 18:00
This is a guest post by Yoav Weiss who was recently recognized through the Google Open Source Peer Bonus Program for his work on the Chromium project. We invited Yoav to share about his work on our blog.

I was recently recognized by Google for my contributions to Chromium and wanted to write a few words on why I contribute to the project, other rendering engines and the web platform in general. I also wanted to share how it helped me evolve as a developer and why more people should contribute to the web platform for their own benefit.
The web platform

I’ve written before about why I think the web platform is an extremely important asset for humanity and why we should make sure it'll thrive for years to come. It enables the distribution of knowledge to the corners of the earth and has fundamentally changed our world. Yet, compared to the number of users (billions!) and web developers (millions), there are only a few hundred engineers working on maintaining and improving the platform itself.

That means that there are many aspects of the platform that are not as well maintained as they should be. We're at a real risk of a "tragedy of the commons" scenario, where despite usage and utility, the platform will collapse under its own weight because maintaining it is nobody's exclusive problem.
How I got started

Personally, I had been working on web performance for well over a decade before I decided to get more involved and lend my hand in building the platform. For a large part of my professional life, browsers were black boxes. They were given to us by the browser gods and that's what we had to work with for the next few years. Their undocumented bugs and quirks became gospel, passed from senior engineers to their juniors.

Then at some point, that situation changed. Slowly but surely, open source browsers started picking up market share. Browsers are no longer black boxes: we can actually see what happens on the inside!

I first got involved by joining the responsive images discussions and the Responsive Images Community Group. Then, I saw a tweet from RICG's chair calling to develop a prototype of the current proposal to prove its feasibility and value. And I jumped in.

I created a prototype using Chromium and WebKit, demoed it to anyone that was interested, worked on the proposals and argued the viability of the proposals' approach on the various mailing lists. Eventually, we were able to get some browser folks on board, improve the proposals and their fit to the rest of the platform, and I started working on an implementation.

The amount of work this required was larger than I expected. Eventually I managed to ship the feature in Blink and Chromium, and complete large parts of the implementation in WebKit as well. WOOT!
Success! Now what?

After that project was done, I started looking into what I should do next. I was determined to continue working on browsers and find a gig that would let me do that. So I searched for an employer with a vested interest in the web and in making it faster, who would be happy to let me work on the platform's client: the web browser.

I found such an employer in Akamai, where I have been working as a Principal Architect ever since. As part of my job I'm working on our performance optimization features as well as performance-related browser features, making sure they make it into browsers in a timely fashion.
Why you should contribute, too

Now, chances are that if you're reading this, you're also relying on the web platform for your job in one way or another. Which means that there's a chance that it also makes sense for your organization to contribute to the web platform. Let’s explore the reasons:

1. Make sure work is done on features you care about

If you're like me, you love the web platform and the reach it provides you, but you're not necessarily happy with all of it. The web is great, but not perfect. Since browsers and web standards are no longer black boxes, you can help change that.

You can work on standards and browsers to change them to include your use-case. That's immense power at your fingertips: put in the work and the platform evolves for all the billions of users out there.

And you don’t have to wait years before new features can be used in production like with yesteryear's browser changes. With today’s browser update rates and progressive enhancement, you’ll probably be able to use changes in production within a few months.
2. Gain expertise that can help you do your job better

Knowing browser internals better can also give you superpowers in other parts of your job. Whenever questions about browser behavior arise, you can take a peek into the source code and have concrete answers rather than speculation.

Keeping track of standards discussions gives you visibility into new browser APIs that are coming along, so that you can opt to use those rather than settle for sub-optimal alternatives that are currently available.
3. Grow as an engineer

Working on browsers teaches you a lot about how things work under the surface and enables you to understand the internals of modern browsers, which are extremely complex machines. Further, this work allows you to get code reviews from the world's leading experts on these subjects. What better way to grow than to interact with the experts?

4. It's a fun and welcoming community

Contributing to the web platform has been a great experience for me. Working with the Chromium project, in particular, is always great fun. The project is Google-backed, but there are many external contributors, and most of the work is done and most decisions are made in the open. The people I've worked with are super friendly and happy to help. All in all, it's really fun!

Join us

The web needs more people working on it, and working on the web platform can be extremely beneficial to you, your career and your business.

If you're interested in getting started with web standards, the Discourse instance of the Web Platform Incubator Community Group (or WICG for short) is where it's at (disclaimer: I'm co-chairing that group). For getting started with Chromium development, this is the post for you.

And most important, don't be afraid to ask the community. People on blink-dev and IRC are super friendly and will be happy to point you in the right direction.

So come on over and join the good cause. We'll be happy to have you!

By Yoav Weiss, Chromium contributor

Announcing OSS-Fuzz: Continuous fuzzing for open source software

Thu, 12/01/2016 - 18:00
We are happy to announce OSS-Fuzz, a new Beta program developed over the past years with the Core Infrastructure Initiative community. This program will provide continuous fuzzing for select core open source software.

Open source software is the backbone of the many apps, sites, services, and networked things that make up “the internet.” It is important that the open source foundation be stable, secure, and reliable, as cracks and weaknesses impact all who build on it.

Recent security stories confirm that errors like buffer overflow and use-after-free can have serious, widespread consequences when they occur in critical open source software. These errors are not only serious, but notoriously difficult to find via routine code audits, even for experienced developers. That's where fuzz testing comes in. By generating random inputs to a given program, fuzzing triggers and helps uncover errors quickly and thoroughly.
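To make the idea of fuzzing with random inputs concrete, here is a minimal sketch. It is not how AFL, libFuzzer or OSS-Fuzz work internally (those are coverage-guided and far more effective); `parseRecord` is a hypothetical, deliberately buggy target invented for this example. In Java the bad read surfaces as an exception thanks to built-in bounds checks, which loosely plays the role a sanitizer plays for C and C++ code.

```java
import java.util.Random;

/** Toy random fuzzer: throws random byte strings at a deliberately buggy parser. */
final class ToyFuzzer {

    /**
     * Hypothetical target: reads a length byte, then sums that many payload bytes.
     * Bug: it trusts the length byte and never checks it against the input size.
     */
    static int parseRecord(byte[] input) {
        if (input.length == 0) {
            return 0;
        }
        int declared = input[0] & 0xFF;
        int sum = 0;
        for (int i = 1; i <= declared; i++) {
            sum += input[i]; // out-of-bounds read when declared >= input.length
        }
        return sum;
    }

    /** Tries up to maxRuns random inputs; returns the first crashing input, or null. */
    static byte[] fuzz(long seed, int maxRuns) {
        Random random = new Random(seed);
        for (int run = 0; run < maxRuns; run++) {
            byte[] input = new byte[random.nextInt(16)];
            random.nextBytes(input);
            try {
                parseRecord(input);
            } catch (RuntimeException crash) {
                return input; // keep the input so the bug can be reproduced
            }
        }
        return null;
    }

    public static void main(String[] args) {
        byte[] crasher = fuzz(42L, 10_000);
        System.out.println(crasher == null
                ? "no crash found"
                : "crash reproduced by an input of " + crasher.length + " bytes");
    }
}
```

Saving the crashing input matters as much as finding it: it gives the maintainer a reproducible test case and lets the fix be confirmed later by replaying the same bytes.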

In recent years, several efficient general purpose fuzzing engines have been implemented (e.g. AFL and libFuzzer), and we use them to fuzz various components of the Chrome browser. These fuzzers, when combined with Sanitizers, can help find security vulnerabilities (e.g. buffer overflows, use-after-free, bad casts, integer overflows, etc), stability bugs (e.g. null dereferences, memory leaks, out-of-memory, assertion failures, etc) and sometimes even logical bugs.

OSS-Fuzz’s goal is to make common software infrastructure more secure and stable by combining modern fuzzing techniques with scalable distributed execution. OSS-Fuzz combines various fuzzing engines (initially, libFuzzer) with Sanitizers (initially, AddressSanitizer) and provides a massive distributed execution environment powered by ClusterFuzz.
Early successes

Our initial trials with OSS-Fuzz have had good results. An example is the FreeType library, which is used on over a billion devices to display text (and which might even be rendering the characters you are reading now). It is important for FreeType to be stable and secure in an age when fonts are loaded over the Internet. Werner Lemberg, one of the FreeType developers, was an early adopter of OSS-Fuzz. Recently the FreeType fuzzer found a new heap buffer overflow only a few hours after the source change:

ERROR: AddressSanitizer: heap-buffer-overflow on address 0x615000000ffa
READ of size 2 at 0x615000000ffa thread T0
SCARINESS: 24 (2-byte-read-heap-buffer-overflow-far-from-bounds)
   #0 0x885e06 in tt_face_vary_cvt src/truetype/ttgxvar.c:1556:31

OSS-Fuzz automatically notified the maintainer, who fixed the bug; then OSS-Fuzz automatically confirmed the fix. All in one day! You can see the full list of fixed and disclosed bugs found by OSS-Fuzz so far.
Contributions and feedback are welcome

OSS-Fuzz has already found 150 bugs in several widely used open source projects (and churns ~4 trillion test cases a week). With your help, we can make fuzzing a standard part of open source development, and work with the broader community of developers and security testers to ensure that bugs in critical open source applications, libraries, and APIs are discovered and fixed. We believe that this approach to automated security testing will result in real improvements to the security and stability of open source software.

OSS-Fuzz is launching in Beta right now, and will be accepting suggestions for candidate open source projects. In order for a project to be accepted to OSS-Fuzz, it needs to have a large user base and/or be critical to global IT infrastructure, a general heuristic that we are intentionally leaving open to interpretation at this early stage. See more details and instructions on how to apply here.

Once a project is signed up for OSS-Fuzz, it is automatically subject to the 90-day disclosure deadline for newly reported bugs in our tracker (see details here). This matches industry best practices and improves end-user security and stability by getting patches to users faster.

Help us ensure this program is truly serving the open source community and the internet which relies on this critical software: contribute and leave your feedback on GitHub.

By Mike Aizatsky, Kostya Serebryany (Software Engineers, Dynamic Tools); Oliver Chang, Abhishek Arya (Security Engineers, Google Chrome); and Meredith Whittaker (Open Research Lead).

Docker + Dataflow = happier workflows

Wed, 11/30/2016 - 18:00
When I first saw the Google Cloud Dataflow monitoring UI -- with its visual flow execution graph that updates as your job runs, and convenient links to the log messages -- the idea came to me. What if I could take that UI, and use it for something it was never built for? Could it be connected with open source projects aimed at promoting reproducible scientific analysis, like Common Workflow Language (CWL) or Workflow Definition Language (WDL)?
Screenshot of a Dockerflow workflow for DNA sequence analysis.
In scientific computing, it’s really common to submit jobs to a local high-performance computing (HPC) cluster. There are tools to do that in the cloud, like Elasticluster and Starcluster. They replicate the local way of doing things, which means they require a bunch of infrastructure setup and management that the university IT department would otherwise do. Even after you’re set up, you still have to ssh into the cluster to do anything. And then there are a million different choices for workflow managers, each unsatisfactory in its own special way.

By day, I’m a product manager. I hadn’t done any serious coding in a few years. But I figured it shouldn’t be that hard to create a proof-of-concept, just to show that the Apache Beam API that Dataflow implements can be used for running scientific workflows. Now, Dataflow was created for a different purpose, namely, to support scalable data-parallel processing, like transforming giant data sets, or computing summary statistics, or indexing web pages. To use Dataflow for scientific workflows would require wrapping up shell steps that launch VMs, run some code, and shuttle data back and forth from an object store. It should be easy, right?

It wasn’t so bad. Over the weekend, I downloaded the Dataflow SDK, ran the wordcount examples, and started modifying. I had a “Hello, world” proof-of-concept in a day.

To really run scientific workflows would require more, of course. Varying VM shapes, a way to pass parameters from one step to the next, graph definition, scattering and gathering, retries. So I shifted into prototyping mode.

I created a new GitHub project called Dockerflow. With Dockerflow, workflows can be defined in YAML files. They can also be written in pretty compact Java code. You can run a batch of workflows at once by providing a CSV file with one row per workflow to define the parameters.

Dataflow and Docker complement each other nicely:

  • Dataflow provides a fully managed service with a nice monitoring interface, retries, graph optimization and other niceties.
  • Docker provides portability of the tools themselves, and there's a large library of packaged tools already available as Docker images.

While Dockerflow supports a simple YAML workflow definition, a similar approach could be taken to implement a runner for one of the open standards like CWL or WDL.

To get a sense of working with Dockerflow, here’s “Hello, World” written in YAML:

defn:
  name: HelloWorkflow
steps:
- defn:
    name: Hello
    inputParameters:
      name: message
      defaultValue: Hello, World!
    docker:
      imageName: ubuntu
      cmd: echo $message

And here’s the same example written in Java:

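import java.io.IOException;
// Dockerflow's own imports (WorkflowDefn, Workflow, Task, TaskBuilder) are omitted here.

public class HelloWorkflow implements WorkflowDefn {
  @Override
  public Workflow createWorkflow(String[] args) throws IOException {
    // A single task that echoes its "message" input inside an ubuntu container.
    Task hello =
        TaskBuilder.named("Hello")
            .input("message", "Hello, World!")
            .docker("ubuntu")
            .script("echo $message")
            .build();
    return TaskBuilder.named("HelloWorkflow").steps(hello).args(args).build();
  }
}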

Dockerflow is just a prototype at this stage, though it can run real workflows and includes many nice features, like dry runs, resuming failed runs from mid-workflow, and, of course, the nice UI. It uses Cloud Dataflow in a way that was never intended -- to run scientific batch workflows rather than large-scale data-parallel workloads. I wish I’d written it in Python rather than Java. The Dataflow Python SDK wasn’t quite as mature when I started.

Which is all to say, it’s been a great 20% project, and the future really depends on whether it solves a problem people have, and if others are interested in improving on it. We welcome your contributions and comments! How do you run and monitor scientific workflows today?

By Jonathan Bingham, Google Genomics and Verily Life Sciences

Google Summer of Code 2016 wrap-up: STE||AR

Tue, 11/29/2016 - 18:00
This is part of a series of guest posts from students, mentors and organization administrators who participated in Google Summer of Code (GSoC) 2016. GSoC is an annual program which pairs university students with mentors to work on open source software.


This summer the STE||AR Group was proud to mentor four students through Google Summer of Code. These students worked on a variety of projects which helped improve our software, HPX. This library is a distributed C++ runtime system which supports a standards compliant API and helps users scale their applications across thousands of machines.

The improvements to the code base will help our team and users of HPX around the world. A summary of our students’ projects:

Parsa Amini – HPX Debugger

Developing a better distributed debugging tool is essential to increase the programmability of HPX. Parsa’s project, Scimitar, aims to facilitate the debugging process for HPX programmers by extending the features of GDB, an existing debugger. The project then complements it with new commands for easier switching between localities across clusters, HPX thread debugging, awareness of internal HPX data structures, and semi-automated preparation for distributed debugging sessions. Additional functionality, such as locating an object and viewing the queue information on each core, is provided through the API that HPX itself exposes. His work can be found on GitHub.

Aalekh Nigam – Implement a Map/Reduce Framework

This project aimed to expose a Map/Reduce programming model over HPX. During the summer, Aalekh was able to develop a single node implementation of HPXflow (map/reduce programming model) and laid the groundwork for the further multi-node version with database support. Although the initial task was limited to implementing the Map/Reduce model, he was also able to implement an improved dataflow model.
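For readers unfamiliar with the Map/Reduce programming model, the sketch below shows its three phases on a single machine using word counting. It is plain Java written for illustration under that assumption; it does not use HPX or HPXflow, and none of the names below come from that project.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Single-machine illustration of the map, shuffle and reduce phases (word count). */
final class WordCountMapReduce {

    /** Map phase: emit a (word, 1) pair for every word in one input line. */
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\W+")) {
            if (!word.isEmpty()) {
                pairs.add(new AbstractMap.SimpleEntry<>(word, 1));
            }
        }
        return pairs;
    }

    /** Shuffle phase: group all emitted values by key. */
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), key -> new ArrayList<>()).add(pair.getValue());
        }
        return grouped;
    }

    /** Reduce phase: combine the values collected for one key. */
    static int reduce(List<Integer> values) {
        int total = 0;
        for (int value : values) {
            total += value;
        }
        return total;
    }

    /** Runs the three phases in sequence over all input lines. */
    static Map<String, Integer> wordCount(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines) {
            pairs.addAll(map(line));
        }
        Map<String, Integer> counts = new TreeMap<>();
        shuffle(pairs).forEach((word, ones) -> counts.put(word, reduce(ones)));
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(wordCount(Arrays.asList("Hello, world", "hello again")));
    }
}
```

In a distributed setting the map calls and the per-key reduce calls are independent of one another, which is what lets a runtime spread them across cores or nodes.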

Minh-Khanh Do – Working on Parallel Algorithms for HPX::Vector

Minh-Khanh’s task was to take the parallel algorithms and add the functionality required to work on the segmented hpx::vector. Under his mentor John Biddiscombe, he implemented the segmented_fill algorithm, which was successfully merged into the main codebase. Additionally, Minh-Khanh implemented the segmented_scan algorithm, which covers both inclusive_scan and exclusive_scan. These changes are included in a pull request and have been merged. Using the segmented scan algorithm it is possible to perform tasks such as evaluating polynomials and to implement other algorithms such as quicksort.
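As a rough illustration of what inclusive and exclusive scans compute, and of how a scan over segmented data can be assembled from per-segment scans, here is a sequential Java sketch. It assumes addition as the operation and plain arrays as segments; it is not the HPX implementation, which runs in parallel over a distributed container.

```java
import java.util.Arrays;

/** Sequential sketch of inclusive/exclusive prefix sums and a scan over segments. */
final class SegmentedScan {

    /** Inclusive scan: out[i] = in[0] + ... + in[i]. */
    static long[] inclusiveScan(long[] in) {
        long[] out = new long[in.length];
        long running = 0;
        for (int i = 0; i < in.length; i++) {
            running += in[i];
            out[i] = running;
        }
        return out;
    }

    /** Exclusive scan: out[i] = init + in[0] + ... + in[i - 1]. */
    static long[] exclusiveScan(long[] in, long init) {
        long[] out = new long[in.length];
        long running = init;
        for (int i = 0; i < in.length; i++) {
            out[i] = running;
            running += in[i];
        }
        return out;
    }

    /**
     * Inclusive scan over data split into segments: scan each segment locally,
     * then add the total carried over from all earlier segments.
     */
    static long[][] segmentedInclusiveScan(long[][] segments) {
        long[][] out = new long[segments.length][];
        long carry = 0;
        for (int s = 0; s < segments.length; s++) {
            long[] local = inclusiveScan(segments[s]);
            for (int i = 0; i < local.length; i++) {
                local[i] += carry;
            }
            if (local.length > 0) {
                carry = local[local.length - 1];
            }
            out[s] = local;
        }
        return out;
    }

    public static void main(String[] args) {
        long[][] result = segmentedInclusiveScan(new long[][] {{1, 2, 3}, {4, 5}});
        System.out.println(Arrays.deepToString(result));
    }
}
```

The local scans are independent of each other, so only the small carry step has to be coordinated between segments; that is the property a parallel implementation can exploit.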

Satyaki Upadhyay – Plugin Mechanism for thread schedulers in HPX

In HPX, schedulers are statically linked and must be built at compile-time. Satyaki’s project involved converting this statically linked scheme into a plugin system which would allow arbitrary schedulers to be dynamically loaded. These changes bring several benefits. They provide a layer of abstraction, follow the open/closed principle of software design, and allow developers to write their own custom schedulers while conforming to a uniform API. The project proceeded in two steps. The first involved the creation of plugin modules of the schedulers and registering them with HPX. The second step was to implement the loading and subsequent use of the chosen scheduler.
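The two steps described above, registering scheduler plugins and then loading the chosen one, follow a common pattern that can be sketched in Java. All names below are hypothetical and the registration happens in-process; this is not HPX's API, and it does not show the dynamic loading of compiled modules that the C++ project deals with.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Sketch of a scheduler plugin registry; all names are hypothetical, not HPX APIs. */
final class SchedulerPlugins {

    /** The uniform API every scheduler plugin conforms to. */
    interface Scheduler {
        void submit(String task);

        String next(); // returns null when no task is queued
    }

    static final class FifoScheduler implements Scheduler {
        private final Deque<String> queue = new ArrayDeque<>();

        @Override public void submit(String task) { queue.addLast(task); }

        @Override public String next() { return queue.pollFirst(); }
    }

    static final class LifoScheduler implements Scheduler {
        private final Deque<String> stack = new ArrayDeque<>();

        @Override public void submit(String task) { stack.addLast(task); }

        @Override public String next() { return stack.pollLast(); }
    }

    /** Step 1: plugins register a factory by name. Step 2: the runtime loads one by name. */
    static final class Registry {
        private final Map<String, Supplier<Scheduler>> factories = new HashMap<>();

        void register(String name, Supplier<Scheduler> factory) {
            factories.put(name, factory);
        }

        Scheduler load(String name) {
            Supplier<Scheduler> factory = factories.get(name);
            if (factory == null) {
                throw new IllegalArgumentException("unknown scheduler: " + name);
            }
            return factory.get();
        }
    }

    public static void main(String[] args) {
        Registry registry = new Registry();
        registry.register("fifo", FifoScheduler::new);
        registry.register("lifo", LifoScheduler::new);

        String chosen = args.length > 0 ? args[0] : "fifo"; // picked at run time
        Scheduler scheduler = registry.load(chosen);
        scheduler.submit("a");
        scheduler.submit("b");
        System.out.println(chosen + " runs first: " + scheduler.next());
    }
}
```

Adding a new scheduler means writing one more class and one more `register` call; the code that loads and uses schedulers stays untouched, which is the open/closed principle in practice.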

We would like to thank our students and mentors for the time that they have contributed to HPX this summer. In addition, we would like to thank Google for the opportunity that they provided the STE||AR Group to work with developers around the globe as well as the ability for students to interact with vibrant open source projects worldwide.

By Adrian Serio, Organization Administrator for The STE||AR Group

It’s that time again: Google Code-in starts today!

Mon, 11/28/2016 - 21:18
Today marks the start of the 7th year of Google Code-in (GCI), our pre-university contest introducing students to open source development. GCI takes place entirely online and is open to students between the ages of 13 and 17 around the globe.
The concept is simple: complete bite-sized tasks (at your own pace) created by 17 participating open source organizations on topic areas you find interesting:

  • Coding
  • Documentation/Training
  • Outreach/Research
  • Quality Assurance
  • User Interface

Tasks take an average of 3-5 hours to complete and include the guidance of a mentor to help along the way. Complete one task? Get a digital certificate. Three tasks? Get a sweet Google t-shirt. Finalists get a hoodie. Grand Prize winners get a trip to Google headquarters in California.

Over the last 6 years, 3213 students from 99 countries have successfully completed tasks in GCI. Intrigued? Learn more about GCI by checking out our rules and FAQs. And please visit our contest site and read the Getting Started Guide.

Teachers, if you are interested in getting your students involved in Google Code-in you can find resources here to help you get started.

By Mary Radomile, Open Source Programs Office

Stories from Google Code-in: Sugar Labs and Systers

Wed, 11/23/2016 - 18:00
Google Code-in (GCI) is our annual contest that gives students age 13 to 17 experience in computer science through contributions to open source projects. This blog post is the final installment in our series reflecting on the experiences of Google Code-in 2015 grand prize winners. Be sure to check out the first three posts.

The Google Code-in contest begins on Monday, November 28th at 9am PT for students. Right now you can learn more about the 17 mentoring organizations that students will be able to work with by going to the contest site. To get students excited for GCI 2016, we’re sharing three more stories from GCI 2015 grand prize winners. These stories illustrate how global the competition is, the challenges students face and the valuable skills they learn working with these open source organizations.

A group of Google Code-in 2015 mentors joined grand prize winners for a day of exploring San Francisco, including the iconic Golden Gate Bridge.

First up is the story of Ezequiel Pereira, a student from Uruguay who worked with Sugar Labs. Sugar Labs is the organization behind Sugar, the operating system for the OLPC XO-1 which the Uruguayan government has distributed to public primary schools. The XO-1 was Ezequiel’s first computer.
Ezequiel’s curiosity in computer science was piqued when a technician came to his school to solve a simple bug that was affecting most XO’s. The technician used the command line which, up to that point, Ezequiel thought was useless. Realizing that the command line offered him a lot of power, Ezequiel began his exploration.
He discovered Google Code-in by reading about another Uruguayan teenager, one who was a grand prize winner in Google Code-in 2012. Ezequiel jumped into the contest and participated for several years, expanding his skills before finishing as a grand prize winner of Google Code-in 2015. Along the way Ezequiel got comfortable with IRC and began helping other students, even making new friends.

Next we have Sara Du from the United States. Sara had been coding for six months when she discovered Google Code-in on Christmas Eve, halfway through the competition. She found lots of interesting tasks, but had trouble finding the right organization to focus on before selecting Systers.
Like many students, Sara was able to quickly jump into code but spent a couple days just getting acquainted with Git and GitHub. This is something we hear from a lot of students and it’s just one of the skills that they pick up by working on real-world projects, along with testing and communication.
Another challenge Sara faced was working with a mentor 16 time zones away from her, which meant that correspondence would often take a day or two. While this was a challenge, she found the long feedback loop encouraged her to get on the Slack channel and reach out to other contributors for help. Ultimately, this made her even more a part of the Systers community.
Sara said Google Code-in was one of the most awesome experiences she’s had and has this advice to offer future participants: “The organization you end up working with has a vibrant community of hackers from everywhere; try to interact with them and you will be sure to learn from others as they will from you!”

Last, but certainly not least, we have Ahmed Sabie, a student from Canada who also worked with Systers. Ahmed started coding competitively several years ago, focusing on graph theory, dynamic programming and data structures. He loved the problem solving, but knew that these competitions took place in a sandbox. To grow, Ahmed would need to explore.
Enter Google Code-in. Ahmed was most comfortable with Python and saw that the Systers Volunteer Management System used that language, so that’s where he started.
Ahmed, like many students and even professional developers, spent much of his first week setting up his development environment. It was a grueling process but with the help of search and the people in the Systers Slack channel he was finally able to see the project’s login screen.
As he completed easy tasks, Ahmed moved on to more difficult tasks and began to help other students, many of whom got stuck on the same issues he had encountered earlier. Ahmed found that each task provided an opportunity to stretch his skills a little bit more. He was excited about how quickly he was learning. Though Ahmed learned a lot on his own, he says the vast majority of what he learned was through the help of other people -- students, mentors and other project contributors -- and that he felt like he was truly a part of the Systers community by the end of the process.
Ahmed’s favorite task was an appropriate finale for the competition: he added multilingual support to an application he had worked on and added the French translation.

“Overall, Google Code-in was the experience of a lifetime. It set me up for the future, by teaching me relevant and critical skills necessary in software development. I have contributed to a good cause, and met fantastic mentors and friends along the way. Open source development is not a onetime thing, it is an ongoing process. I hope to continue to be part of it, and to me it is a form of volunteering and giving back to the community.” - Ahmed Sabie
With that, we conclude our series of posts reflecting on Google Code-in 2015. We thank Ezequiel, Sara, Ahmed and all the other participants for sharing their stories and contributing to the software we all rely on. We hope you will join us in carrying on the tradition with Google Code-in 2016!
By Josh Simmons, Open Source Programs Office

Google Summer of Code 2016 wrap-up: Linux XIA

Fri, 11/18/2016 - 19:00
We're sharing guest posts from students, mentors and organization administrators who participated in Google Summer of Code 2016. This is the fifth post in that series and there are more on the way.


Linux XIA is the native implementation of XIA, a meta network architecture that supports evolution of all of its components, which we call “principals,” and promotes interoperability between these principals. It is the second year that our organization, Boston University / XIA, has participated in Google Summer of Code (GSoC), and this year we received 31 proposals from 8 countries.

Our ideas list this year focused on upgrading key forwarding data structures to their best known versions. Our group chose the most deserving students for each of the following projects:

Accelerating the forwarding speed of the LPM principal with poptrie

Student André Ferreira Eleuterio and mentor Cody Doucette implemented the first version of the LPM principal in Linux XIA for GSoC 2015. The LPM principal enables Linux XIA to leverage routing tables derived from BGP, OSPF, IS-IS and any other IP routing protocol to forward XIA packets natively, that is, without encapsulation in IP. For GSoC 2016, student Vaibhav Raj Gupta from India partnered with mentor Cody Doucette to speed up the LPM principal by employing poptrie, a state-of-the-art data structure for finding the longest matching prefix on general-purpose processors.
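For readers unfamiliar with longest prefix matching, here is a minimal sketch of the lookup itself. It uses a plain one-bit-per-level binary trie over IPv4 addresses, which is an assumption made for illustration: it is not poptrie (which compresses the trie to make lookups much faster) and it is not code from Linux XIA. The class name and the sample routes are hypothetical.

```python
import ipaddress


class BinaryTrie:
    """Plain binary trie for IPv4 longest-prefix matching (not poptrie)."""

    def __init__(self):
        # Each node is a list: [child for bit 0, child for bit 1, next hop].
        self.root = [None, None, None]

    def insert(self, prefix, next_hop):
        """Store next_hop under a prefix such as '10.1.0.0/16'."""
        net = ipaddress.ip_network(prefix)
        bits = int(net.network_address)
        node = self.root
        for i in range(net.prefixlen):
            bit = (bits >> (31 - i)) & 1
            if node[bit] is None:
                node[bit] = [None, None, None]
            node = node[bit]
        node[2] = next_hop

    def lookup(self, address):
        """Return the next hop of the longest prefix covering address."""
        bits = int(ipaddress.ip_address(address))
        node = self.root
        best = node[2]  # a default route (prefix length 0) lives at the root
        for i in range(32):
            node = node[(bits >> (31 - i)) & 1]
            if node is None:
                break
            if node[2] is not None:
                best = node[2]  # remember the deepest match seen so far
        return best


table = BinaryTrie()
table.insert("0.0.0.0/0", "default")
table.insert("10.0.0.0/8", "A")
table.insert("10.1.0.0/16", "B")
print(table.lookup("10.1.2.3"))
```

The lookup walks at most 32 levels and keeps the deepest next hop it passes, so the most specific route wins. Structures like poptrie aim to get the same answer in far fewer memory accesses.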

Upgrading the FIB hash table of principals to the relativistic hash table

Principals that rely on routing flat names have used a resizable hash table that supports lockless readers since 2011. While this data structure was unique in 2011, in the same year, relativistic hash tables were published. The appeal to upgrade to relativistic hash tables was twofold: reduced memory footprint per hashed element, and the fact they were implemented in the Linux kernel in 2014. Student Sachin Paryani, also from India, worked with mentor Qiaobin Fu to replace our resizable hash table with the relativistic hash table.

Google Summer of Code nurtures a brighter future. Thanks to GSoC, our project has received important code contributions, and our community has been enlarged. It was rewarding to learn that two of our GSoC students have decided to pursue graduate school after their GSoC experience with us: Pranav Goswami (2015) and Sachin Paryani (2016). We hope these examples will motivate other students to do their best because the world is what we make of it.

By Michel Machado, Boston University / XIA organization administrator
Categories: Open Source

Google Summer of Code 2016 wrap-up: Debian

Wed, 11/16/2016 - 18:30
This is the fourth post in our series of wrap-ups and guest posts from participants reflecting on Google Summer of Code (GSoC) 2016. Explore the first three posts and stay tuned for more wrap-ups and announcements.



Debian, founded in 1993, is a project aimed at building a 100% free and open source “Universal Operating System.” It’s a volunteer-driven project based on Linux, FreeBSD and Hurd kernels for devices ranging from mobile phones to large clusters.

Being a wide umbrella project, Debian offered a diverse array of opportunities for Google Summer of Code (GSoC) students. For example, students worked on making our distribution more trustworthy (reproducible builds), porting our OS to Android devices and improving infrastructure for developers. This year I joined the Debian Real-Time Communications (RTC) mentoring team which engaged 13 students to improve voice, video and chat communication with free software.

WebRTC, an open standard enabling real-time video and audio communication in the browser, is central to this work. It was used to create JSCommunicator, an embeddable WebRTC phone developed in HTML, CSS and JavaScript, supporting voice, video and chat using SIP over WebSockets. A GSoC 2014 student, Juliana Louback, significantly enhanced JSCommunicator during her summer with Debian.

JSCommunicator is now being adapted for use with content management systems (CMS) and blogging platforms, making it easy to embed rich communication features in existing systems. It was this work that our current GSoC students built on.

This year I mentored GSoC student Mesut Can Gurle, who used DruCall, a Drupal module for integrating JSCommunicator, as inspiration for building WPCall for WordPress. With this new plug-in, standards-based voice, video and chat are now available on the world’s two most popular CMS without the need for browser plugins.

The way WPCall was extrapolated from the DruCall plugin provides a pattern that other communities can follow to rapidly create WebRTC plugins for their own web frameworks. The JSCommunicator Integration Guide provides step-by-step instructions that developers and future students can follow. If you’re interested in learning more about significant developments in this space, please subscribe to the Free-RTC Announce mailing list and follow planet.freertc.org.

This was my first year as a GSoC mentor and I had such a great experience. It was rewarding working with Mesut on achieving his goals and we learned a lot along the way. Despite some setbacks (he narrowly missed a bombing as his country experienced an attempted coup), Mesut has made valuable contributions to free software.

As the summer wound down, I received an invitation to participate in a t-shirt design contest for the annual Mentor Summit. I thought it would be fun to try and put together a design focusing on GSoC’s key values.

The front of the t-shirt shows developers from all over the world collaborating on free software, representing the amazing scope and diversity of the projects. On the back, above the clouds, a space shuttle symbolizes what’s achieved through GSoC.

A group of attendees wearing the Google Summer of Code 2016 Mentor Summit t-shirt.
Happily, my design was selected and it was great seeing all the attendees wearing it at the Mentor Summit!

By Bruno Magalhães, Mentor for Debian
Categories: Open Source

ETC2Comp: fast texture compression for games and VR

Mon, 11/14/2016 - 19:00
For mobile game and VR developers the ETC2 texture format has become an increasingly valuable tool for texture compression. It produces good on-GPU sizes (it stays compressed in memory) and higher quality textures (compared to its ETC1 counterpart).

These benefits come with a significant downside, however: ETC2 textures take significantly longer to compress than their ETC1 counterparts. As adoption of the ETC2 format increases in a project, so do build times. As such, developers have had to make the classic choice between quality and time.

We wanted to eliminate the need for developers to make that choice, so we’ve released ETC2Comp, a fast and high quality ETC2 encoder for games and VR developers.

ETC2 takes a long time to compress textures because the format defines a large number of possible combinations for encoding a block in the texture. Finding the highest quality compressed image means brute-forcing this incredibly large number of combinations, which clearly is not a time-efficient option.

We designed ETC2Comp to get the same visual results at much faster speeds by deploying a few optimization techniques:

Directed Block Search. Rather than a brute-force search, ETC2Comp uses a much more limited, targeted search for the best encoding for a given block. ETC2Comp comes with a precomputed set of archetype blocks, where each archetype is associated with a sorted list of the ETC2 block format types that provide its best encodings. During the actual compression of a texture, each block is initially assigned an archetype, and multiple passes are done to test the block against its block format list to find the best encoding. As a result, the best option can be found much quicker than with a brute-force method.

Full effort setting. During each pass of the encoding process, all the blocks of the image are sorted by their visual quality (worst-looking to best-looking). ETC2Comp takes an effort parameter whose value specifies what percentage of the blocks to update during each pass of encoding. An effort value of 25, for instance, means that on each pass, only the 25% worst looking blocks are tested against the next format in their archetypes' format-chains. The result is a tradeoff between optimizing blocks that already look good, and the time it takes to do it.

Highly multi-threaded code. Since blocks can be evaluated independently during each pass, it’s straightforward to apply multithreading to the work. During encoding ETC2Comp can take advantage of available parallel threads, and it even accepts a jobs parameter, where you can define exactly the number of threads you’d like it to use... in case you have a 256 core machine.
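To make the effort parameter concrete, here is a small sketch of how the blocks for one pass could be chosen. It is an illustration under stated assumptions, not ETC2Comp's actual C++ code: the function name, the per-block error list and the round-up rule are all hypothetical.

```python
import math


def select_blocks_for_pass(block_errors, effort):
    """Return indices of the worst-looking blocks to re-test in this pass.

    block_errors: one error value per block (higher means worse looking).
    effort: percentage of blocks to update, from 0 to 100.
    """
    if not 0 <= effort <= 100:
        raise ValueError("effort must be between 0 and 100")
    # Assumption: round up, so a small non-zero effort still updates a block.
    count = math.ceil(len(block_errors) * effort / 100)
    # Sort block indices from worst-looking to best-looking.
    worst_first = sorted(
        range(len(block_errors)),
        key=lambda i: block_errors[i],
        reverse=True,
    )
    return worst_first[:count]


errors = [0.1, 0.9, 0.3, 0.7, 0.2, 0.05, 0.4, 0.6]
print(select_blocks_for_pass(errors, 25))
```

With eight blocks and an effort of 25, only the two blocks with the highest error are selected. Each selected block would then be tested against the next format in its archetype's list, and because blocks are independent, that work can be spread across threads.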

Check out the code on GitHub to get started with ETC2Comp and let us know what you think. You can use the tool from the command line or embed the C++ library in your project. If you want to know more about what’s going on under the hood, check out this blog post.

By Colt McAnlis, Developer Advocate
Categories: Open Source

Open source visualization of GPS displacements for earthquake cycle physics

Fri, 11/11/2016 - 19:30
The Earth’s surface is moving, ever so slightly, all the time. This slow, small, but persistent movement of the Earth's crust is responsible for the formation of mountain ranges, sudden earthquakes, and even the positions of the continents. Scientists around the world measure these almost imperceptible movements using arrays of Global Navigation Satellite System (GNSS) receivers to better understand all phases of an earthquake cycle—both how the surface responds after an earthquake, and the storage of strain energy between earthquakes.

To help researchers explore this data and better understand the earthquake cycle, we are releasing a new, interactive data visualization which draws geodetic velocity lines on top of a relief map by amplifying position estimates relative to their true positions. Unlike existing approaches, which focus on small time slices or individual stations, our visualization can show all the data for a whole array of stations at once. Open sourced under an Apache 2 license, and available on GitHub, this visualization technique is a collaboration between Harvard’s Department of Earth and Planetary Sciences and Google's Machine Perception and Big Picture teams.

Our approach helps scientists quickly assess deformations across all phases of the earthquake cycle—both during earthquakes (coseismic) and the time between (interseismic). For example, we can see azimuth (direction) reversals of stations as they relate to topographic structures and active faults. Digging into these movements will help scientists vet their models and their data, both of which are crucial for developing accurate computer representations that may help predict future earthquakes.

Classical approaches to visualizing these data have fallen into two general categories: 1) a map view of velocity/displacement vectors over a fixed time interval and 2) time versus position plots of each GNSS component (longitude, latitude and altitude).

Examples of classical approaches. On the left is a map view showing average velocity vectors over the period from 1997 to 2001[1]. On the right you can see a time versus eastward (longitudinal) position plot for a single station.
Each of these approaches has proved to be an informative way to understand the spatial distribution of crustal movements and the time evolution of solid earth deformation. However, because geodetic shifts happen in almost imperceptible distances (mm) and over long timescales, both approaches can only show a small subset of the data at any time: a condensed average velocity per station, or a detailed view of a single station, respectively. Our visualization enables a scientist to see all the data at once, then interactively drill down to a specific subset of interest.

Our visualization approach is straightforward; by magnifying the daily longitude and latitude position changes, we show tracks of the evolution of the position of each station. These magnified position tracks are shown as trails on top of a shaded relief topography to provide a sense of position evolution in geographic context.

To see how it works in practice, let’s step through an example. Consider this tiny set of longitude/latitude pairs for a single GNSS station, which differ only in their last few digits:

Day Index   Longitude      Latitude
0           139.06990407   34.949757897
1           139.06990400   34.949757882
2           139.06990413   34.949757941
3           139.06990409   34.949757921
4           139.06990413   34.949757904

If we were to draw line segments between these points directly on a map, they’d be much too small to see at any reasonable scale. So we take these minute differences and multiply them by a user-controlled scaling factor. By default this factor is 10^5.5 (about 316,000x).
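The magnification step can be sketched in a few lines. This is a rough illustration rather than the project's code: it takes the default factor to be 10^5.5 (about 316,000, as quoted above), and it assumes that differences are measured from the station's first position in the selected range. The function name is hypothetical.

```python
SCALE = 10 ** 5.5  # default scaling factor, roughly 316,000

# The five sample positions from the table above: (longitude, latitude).
positions = [
    (139.06990407, 34.949757897),
    (139.06990400, 34.949757882),
    (139.06990413, 34.949757941),
    (139.06990409, 34.949757921),
    (139.06990413, 34.949757904),
]


def magnify_track(points, scale=SCALE):
    """Amplify each point's offset from the first point by scale.

    Assumption: the first point is the anchor and stays where it is.
    """
    lon0, lat0 = points[0]
    return [
        (lon0 + (lon - lon0) * scale, lat0 + (lat - lat0) * scale)
        for lon, lat in points
    ]


for day, (lon, lat) in enumerate(magnify_track(positions)):
    print(day, lon, lat)
```

A longitude change of 0.00000006 degrees between day 0 and day 2 becomes roughly 0.019 degrees after scaling, which is large enough to draw as a visible line segment on a regional map.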


To help the user identify which end is the start of the line, we give the start and end points different colors and interpolate between them. Blue and red are the default colors, but they’re user-configurable. Although day-to-day movement of stations may seem erratic, by using this method, one can make out a general trend in the relative motion of a station.
Close-up of a single station’s movement during the three-year period from 2003 to 2006.
However, static renderings of this sort suffer from the same problem that velocity vector images do; in regions with a high density of GNSS stations, tracks overlap significantly with one another, obscuring details. To solve this problem, our visualization lets the user interactively control the time range of interest, the amount of amplification and other settings. In addition, by animating the lines from start to finish, the user gets a real sense of motion that’s difficult to achieve in a static image.

We’ve applied our new visualization to the ~20 years of data from the GEONET array in Japan. Through it, we can see small but coherent changes in direction before and after the great 2011 Tohoku earthquake.
GPS data sets (in .json format) for both the GEONET data in Japan and the Plate Boundary Observatory (PBO) data in the western US are available at earthquake.rc.fas.harvard.edu.
This short animation shows many of the visualization’s interactive features. In order:
  1. Modifying the multiplier adjusts how significantly the movements are magnified.
  2. We can adjust the time slider nubs to select a particular time range of interest.
  3. Using the map controls provided by the Google Maps JavaScript API, we can zoom into a tiny region of the map.
  4. By enabling map markers, we can see information about individual GNSS stations.
By focusing on a station of interest, we can even see curvature changes in the time periods before and after the event.
Station designated 960601 of Japan’s GEONET array is located on the island of Mikura-jima. Here we see the period from 2006 to 2012, with movement magnified 10^5.1 times (126,000x).
To achieve fast rendering of the line segments, we created a custom overlay using THREE.js to render the lines in WebGL. Data for the GNSS stations is passed to the GPU in a data texture, which allows our vertex shader to position each point on-screen dynamically based on user settings and animation.

We’re excited to continue this productive collaboration between Harvard and Google as we explore opportunities for groundbreaking, new earthquake visualizations. If you’d like to try out the visualization yourself, follow the instructions at earthquake.rc.fas.harvard.edu. It will walk you through the setup steps, including how to download the available data sets. If you’d like to report issues, great! Please submit them through the GitHub project page.

Acknowledgments

We wish to thank Bill Freeman, a researcher on Machine Perception, who hatched the idea and developed the initial prototypes, and Fernanda Viégas and Martin Wattenberg of the Big Picture Team for their visualization design guidance.

References

[1] Loveless, J. P., and Meade, B. J. (2010). Geodetic imaging of plate motions, slip rates, and partitioning of deformation in Japan, Journal of Geophysical Research.

By Jimbo Wilson, Software Engineer, Big Picture Team and Brendan Meade, Professor, Harvard Department of Earth and Planetary Sciences

Categories: Open Source

Celebrating TensorFlow’s First Year

Wed, 11/09/2016 - 18:13
Originally posted on Google Research blog

It has been an eventful year since the Google Brain Team open-sourced TensorFlow to accelerate machine learning research and make technology work better for everyone. There has been an amazing amount of activity around the project: more than 480 people have contributed directly to TensorFlow, including Googlers, external researchers, independent programmers, students, and senior developers at other large companies. TensorFlow is now the most popular machine learning project on GitHub.


With more than 10,000 commits in just twelve months, we’ve made numerous performance improvements, added support for distributed training, brought TensorFlow to iOS and Raspberry Pi, and integrated TensorFlow with widely-used big data infrastructure. We’ve also made TensorFlow accessible from Go, Rust, and Haskell, released state-of-the-art image classification models – and answered thousands of questions on GitHub, StackOverflow, and the TensorFlow mailing list along the way.

At Google, TensorFlow supports everything from large-scale product features to exploratory research. We recently launched major improvements to Google Translate using TensorFlow (and Tensor Processing Units, which are special hardware accelerators for TensorFlow). Project Magenta is working on new reinforcement learning-based models that can produce melodies, and a visiting PhD student recently worked with the Google Brain team to build a TensorFlow model that can automatically interpolate between artistic styles. DeepMind has also decided to use TensorFlow to power all of their research – for example, they recently produced fascinating generative models of speech and music based on raw audio.

We’re especially excited to see how people all over the world are using TensorFlow. For example:

  • Australian marine biologists are using TensorFlow to find sea cows in tens of thousands of hi-res photos to better understand their populations, which are under threat of extinction. 
  • An enterprising Japanese cucumber farmer trained a model with TensorFlow to sort cucumbers by size, shape, and other characteristics.
  • Radiologists have adapted TensorFlow to identify signs of Parkinson’s disease in medical scans.
  • Data scientists in the Bay Area have rigged up TensorFlow and the Raspberry Pi to keep track of the Caltrain.

We’re committed to making sure TensorFlow scales all the way from research to production and from the tiniest Raspberry Pi all the way up to server farms filled with GPUs or TPUs. But TensorFlow is more than a single open-source project – we’re doing our best to foster an open-source ecosystem of related software and machine learning models around it:

  • The TensorFlow Serving project simplifies the process of serving TensorFlow models in production.
  • TensorFlow “Wide and Deep” models combine the strengths of traditional linear models and modern deep neural networks. 
  • For those who are interested in working with TensorFlow in the cloud, Google Cloud Platform recently launched Cloud Machine Learning, which offers TensorFlow as a managed service.

Furthermore, TensorFlow’s repository of models continues to grow with contributions from the community, with more than 3000 TensorFlow-related repositories listed on GitHub alone! To participate in the TensorFlow community, you can follow our new Twitter account (@tensorflow), find us on GitHub, ask and answer questions on StackOverflow, and join the community discussion list.

Thanks very much to all of you who have already adopted TensorFlow in your cutting-edge products, your ambitious research, your fast-growing startups, and your school projects; special thanks to everyone who has contributed directly to the codebase. In collaboration with the global machine learning community, we look forward to making TensorFlow even better in the years to come!

By Zak Stone, Product Manager for TensorFlow
Categories: Open Source

Google Summer of Code 2016 blog post round-up

Tue, 11/08/2016 - 16:00
We’re publishing guest posts from Google Summer of Code (GSoC) students, mentors and organizations every week and more are coming. Many have already written GSoC wrap-up posts on their own blogs, so we’ve rounded them up for you to explore.


“Static types in Python, oh my(py)!” by Tim Abbott, org admin for Zulip
“We posted mypy annotations as one of our project ideas for Google Summer of Code (GSoC). We found an incredible student, Eklavya Sharma, for the project. Eklavya did the vast majority of the hard work of annotating Zulip. Amazingly, he also found the time during the summer to migrate Zulip to use virtualenvs and then upgrade Zulip to Python 3!”


“A road from Google Summer of Code student to organization administrator” by Araz Abishov, org admin for HISP
“Google has created unprecedented opportunity both for young developers and open source communities, which I think everyone should take advantage of. GSoC is more than just a three months internship, and I hope that this post will be a good example of how it can change anyone’s life.”


“Summer of Code 2016: Wrapping it up” by Martin Braun, org admin for GNU Radio
“This summer was a great summer in terms of student participation. All three students will be presenting their work (either in person, or via poster) at this year’s GNU Radio Conference in Boulder, Colorado.”


“2016 Google Summer of Code Wrap-Up” by Ed Cable, org admin for Mifos Initiative
“Each year GSoC continues to unite and grow our community in different ways. Once again, we received incredibly valuable contributions to our Mifos X web and mobile clients this summer; most importantly we have cultivated numerous passionate contributors that will be a part of our community long into the future.”


“Road to GSoC 2016” by Minh Chu, student who worked on Neverland for KDE
“I was nervous about choosing a project. So many projects and requirements! After many hours, I finally decided to write a proposal for KDE’s Neverland Theme Builder and was accepted.”


“Git Rev News” by Christian Couder, mentor for Git
“Such performance improvements as well as the code consolidations around the sequencer are of course very nice. It is interesting and satisfying to see that they are the result of building on top of previous work over the years by GSoC students, mentors and reviewers.”


“Google Summer of Code 2016 Conclusion” by Amine Khaldi, org admin for ReactOS
“Students stumble upon many of the same difficulties ReactOS' own senior developers encountered during their early days, including that ever painful but necessary step to using a proper debugger instead of relying on printf statements in the code.”


“My Journey in Open Source / How to Get Started Contributing” by Nelson Liu, student who worked on scikit-learn for PSF
“The best way to get started is to simply jump in! There are a myriad of ways to contribute to an open source project. Obviously, writing code to fix bugs, add new features, or enhance existing ones are useful. However, you don't have to write code to help out!”


“Lasp and the Google Summer of Code” by Borja o’Cook, student who worked on Lasp for BEAM Community
“All in all, it's been an amazing experience. I've received a lot of support from my mentors and teammates; the Lasp team is full of incredible people.”


“GSoC 2016 Students in TEAMMATES” by Damith C. Rajapakse, org admin for TEAMMATES
“We had our biggest batch of students (7 students) in GSoC 2016, selected from 93 proposals, and representing 4 countries and 4 universities, working on TEAMMATES (an online feedback management system for education) and related sub projects.”


“User-friendly encryption now in Drupal 8!” by Colan Schwartz, mentor for Drupal
“There were several students interested in the topic, and wrote proposals to match. Talha Paracha's excellent proposal was accepted, and he began in earnest. With Adam Bergstein (nerdstein) and I mentoring him, Talha successfully worked through all phases of the project.”


“GSoC with Shogun” by Sanuj Sharma, student who worked on Shogun
“This was an excellent learning experience for me and I got to work with people from different countries (UK, Russia, Singapore, Germany) and cultures. I highly recommend students to participate in Google Summer of Code by looking for projects that interest them because having open source experience is highly beneficial, especially for programmers.”


We have wrap-up posts coming out every week so stay tuned for more. If you’re interested in participating in Google Summer of Code 2017, you can find details here.

By Josh Simmons, Open Source Programs Office
Categories: Open Source

Announcing the Google Code-in 2016 mentor organizations

Mon, 11/07/2016 - 18:30
We’re excited to introduce the 17 open source organizations that are participating as mentor organizations for Google Code-in 2016. The contest, now in its seventh year, gives 13-17 year old pre-university students the opportunity to learn under the guidance of mentors by using their skills on real world applications, that is, open source projects.

Google Code-in officially starts for students on November 28, but students are encouraged to learn about the mentor organizations ahead of time and can get started by clicking on the links below.

  • Apertium - rule-based machine translation platform
  • BRL-CAD - computer graphics, 2D and 3D geometry modeling, and computer-aided design (CAD)
  • CCExtractor - open source tools for subtitle generation
  • Copyleft Games - building game development platforms for tomorrow
  • Drupal - content management platform
  • FOSSASIA - developing communities across all ages and borders to form a better future with Open Technologies and ICT
  • Haiku - operating system specifically targeting personal computing
  • KDE - team that creates Free Software for desktop and portable computing
  • MetaBrainz - builds community maintained databases
  • Mifos Initiative - transforming the delivery of financial services to the poor and the unbanked
  • MovingBlocks - like an open source Minecraft
  • OpenMRS - open source medical records system for the world
  • SCoRe - research lab that seeks sustainable solutions for problems faced by developing countries
  • Sugar Labs - learning platform and activities for elementary education
  • Systers - community for women involved in the technical aspects of computing
  • Wikimedia - non-profit foundation dedicated to bringing free content to the world, operating Wikipedia
  • Zulip - powerful, threaded open source group chat with apps for every major platform
Mentor organizations are currently creating thousands of tasks for students covering code, documentation, user interface, quality assurance, outreach, research and training. The contest officially starts for students on Monday, November 28th at 9:00am PST.
You can learn more about Google Code-in on the contest site where you’ll find Contest Rules, Frequently Asked Questions and Important Dates. There you’ll also find flyers and other helpful information including the Getting Started Guide. Our discussion mailing list is a great way to talk with other students, mentors and organization administrators about the contest. For questions about eligibility or other general questions, you can contact us at gci-support@google.com.
By Josh Simmons, Open Source Programs Office
Categories: Open Source

Cilium: Networking and security for containers with BPF and XDP

Mon, 11/07/2016 - 13:42
This is a guest post by Daniel Borkmann who was recently recognized through the Google Open Source Peer Bonus program for his work on the Cilium project. We invited Daniel to share his project on our blog.

Our open source project, called Cilium, started as an experiment for Linux container networking tackling four requirements:

  • Scale: How can we scale in terms of addressing and with regards to network policy?
  • Extensibility: Can we be as extensible as user space networking in the Linux kernel itself?
  • Simplicity: What is an appropriate abstraction away from traditional networking?
  • Performance: Do we sacrifice performance in the process of implementing the aforementioned aspects?

We realize these goals in Cilium with the help of eBPF. eBPF is an efficient and generic in-kernel bytecode engine that allows for full programmability. There are many subsystems in the Linux kernel that utilize eBPF, mainly in the areas of networking, tracing and security.

eBPF can be attached to key ingress and egress points of the kernel's networking data path for every network device. As input, eBPF operates on the kernel's network packet representation and can thus access and mangle various kinds of data, redirect the packet to other devices, perform encapsulations, etc.

This is a typical workflow: eBPF is programmed in a subset of C and compiled with LLVM, which contains an eBPF back-end. LLVM then generates an ELF file containing program code, specifications for maps and related relocation data. In eBPF, maps are efficient key/value stores in the kernel that can be shared between various eBPF programs and also with user space. Given the ELF file, tools like tc (traffic control) can parse its content and load the program into the kernel. Before the program is executed, the kernel verifies the eBPF bytecode to make sure that it cannot affect the kernel's stability (e.g. by crashing the kernel or accessing memory out of bounds) and that it always terminates, which requires programs to be free of loops. Once it has passed verification, the program is JIT (just-in-time) compiled.

Today, architectures such as x86_64, arm64, ppc64 and s390 have the ability to compile a native opcode image out of an eBPF program, so that instead of an execution through an in-kernel eBPF interpreter, the resulting image can run natively like any other kernel code. tc then installs the program into the kernel's networking data path, and with a capable NIC, the program can also be offloaded entirely into the hardware.


Cilium acts as a middle layer, plugs into container runtimes and orchestrators such as Kubernetes, Docker or CNI, and can generate and atomically update eBPF programs on the fly without requiring a container to restart. Thus, unlike connection proxies, an update of the datapath does not cause connections to be dropped. These programs are specifically tailored and optimized for each container: for example, a feature that a particular container does not need can simply be compiled out, and the majority of the configuration becomes constant, allowing LLVM to optimize further.

We have implemented many building blocks in Cilium using eBPF, such as NAT64, L3/L4 load balancing with direct server return, a connection tracker, port mapping, access control, an NDisc and ARP responder, and integration with various encapsulations like VXLAN, Geneve and GRE, just to name a few. Since all these building blocks run in the Linux kernel and have a stable API, there is of course no need to cross the kernel/user space boundary, which makes eBPF a perfectly suited and flexible technology for container networking.

One step further in that direction is XDP, which was recently merged into the Linux kernel and allows for DPDK-like performance for the kernel itself. The basic idea is that XDP is tightly coupled with eBPF and hooks into a very early ingress path at the driver layer, where it operates with direct access to the packet's DMA buffer.

This is effectively as low-level as it can get to reach near-optimal performance, which mainly allows for tailoring high-performance load balancers or routers with commodity hardware. One advantage that comes with XDP is also that it reuses the kernel's security model for accessing the device as opposed to user space based mechanisms. It doesn't require any third party modules and works in concert with the Linux kernel. Both XDP and tc with eBPF are complementary to each other, and constitute a bigger piece of the puzzle for Cilium itself.

If you’re curious, check out the Cilium code or demos on GitHub.


By Daniel Borkmann, Cilium contributor
Categories: Open Source

Podcast to YouTube: an open source story

Fri, 11/04/2016 - 18:00
Almost a year ago Mark Mandel and I started the Google Cloud Platform Podcast, a weekly podcast that covers topics related to Google Cloud Platform, among other things. It's been a pretty successful podcast, but that’s not what I want to write about today.

After a while we started receiving emails from listeners who wanted to access our podcast on YouTube. Even though this might seem strange to those who love podcasts and have their favorite app on their phones, we decided that the customer is always right: we should post every episode to YouTube.

Specifications

Ok, so … how? Well, to create a video I need to merge the mp3 audio from an episode with a static image. Let's include the title of the episode and the Google Cloud Platform Podcast logo.


But once we post the video to YouTube we're going to need more than that! We need a description, some tags, and probably a link to the episode (SEO FTW!).

Where can we get that information from? Let's think about this for a minute. Where are others getting this information from? The RSS feed! Would it be possible to create a tool to which I could say "post the video for episode 46" and a couple minutes later the video appeared on YouTube? That'd be awesome! Let's do that!

Architecture

The application I wrote parses an RSS feed and, given the episodes to publish, downloads the metadata and audio for each episode, generates the corresponding videos, and pushes them to YouTube.
Diagram of the flow of data in podcast-to-youtube

The hardest parts here are the creation of the image and the video. The rest is sending HTTP requests right and left.
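The feed-parsing step can be sketched roughly as follows. This is a minimal illustration in Python rather than the tool's actual Go code; the sample feed, the `parse_episodes` function and the dictionary keys are all made up for this example, and it assumes a plain RSS 2.0 feed where each `item` carries its audio in an `enclosure` element.

```python
import xml.etree.ElementTree as ET

# A tiny, invented feed used only for illustration (not the real podcast feed).
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example Podcast</title>
    <item>
      <title>Episode 46: Example title</title>
      <link>https://example.com/episodes/46</link>
      <description>Show notes for episode 46.</description>
      <enclosure url="https://example.com/audio/46.mp3" type="audio/mpeg" length="1234"/>
    </item>
    <item>
      <title>Episode 47: Another title</title>
      <link>https://example.com/episodes/47</link>
      <description>Show notes for episode 47.</description>
      <enclosure url="https://example.com/audio/47.mp3" type="audio/mpeg" length="5678"/>
    </item>
  </channel>
</rss>"""


def parse_episodes(feed_xml):
    """Extract the metadata a video upload would need from each RSS item."""
    root = ET.fromstring(feed_xml)
    episodes = []
    for item in root.iter("item"):
        enclosure = item.find("enclosure")
        episodes.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "description": item.findtext("description", default=""),
            # The enclosure URL points at the mp3 to download.
            "audio_url": enclosure.get("url") if enclosure is not None else None,
        })
    return episodes


if __name__ == "__main__":
    for episode in parse_episodes(SAMPLE_FEED):
        print(episode["title"], "->", episode["audio_url"])
```

The title feeds the image, the description and link feed the YouTube metadata, and the enclosure URL is the audio to download.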

Image Maker: rendering images in pure Go

After trying a couple of different tools I decided that the easiest was to create the image from scratch in Go using the image package from the standard library and a freetype library available on GitHub.

Probably the most fun part was to be able to choose a font that would make the title fit the image correctly regardless of the length in characters. I ended up creating a loop that:
  • chooses a font and measures the width of the resulting text
  • if it's too wide, decreases the font size by one and repeats.
Surprisingly, for me, this is actually a pretty common practice!
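The loop above can be sketched like this. It is written in Python for brevity, not taken from the Go package; `measure_width` is a hypothetical stand-in for a real font-metrics call, and the starting size, minimum size and average character width are assumptions chosen for illustration.

```python
def measure_width(text, font_size, avg_char_width=0.6):
    """Hypothetical width model: every character is avg_char_width * font_size wide.

    A real implementation would ask the font library for the rendered width.
    """
    return len(text) * font_size * avg_char_width


def fit_font_size(text, max_width, start_size=72, min_size=8, measure=measure_width):
    """Shrink the font size one step at a time until the text fits max_width."""
    size = start_size
    while size > min_size and measure(text, size) > max_width:
        size -= 1
    return size


if __name__ == "__main__":
    title = "An unusually long episode title that needs a smaller font"
    print(fit_font_size(title, max_width=1000))
```

The `min_size` floor is an addition of this sketch: without it, a very long title would shrink the text until it is unreadable.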

It is also worth mentioning the way I test the package: Using a standard image that I compare to the one generated by the package, then showing a "diff" image where all the pixels that differ are highlighted in red.
Diff image generated when using a wrong DPI.

The code for this package is available here.
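The comparison idea can be sketched as below. This is an illustrative Python version, not the package's Go test code; images are modeled as nested lists of RGB tuples to keep the example self-contained, and `diff_image` is a name invented here.

```python
RED = (255, 0, 0)


def diff_image(expected, actual):
    """Compare two images pixel by pixel.

    Images are lists of rows, each row a list of (r, g, b) tuples.
    Returns (diff, count): a copy of the expected image with every
    differing pixel painted red, and the number of differing pixels.
    """
    if len(expected) != len(actual) or any(
        len(row_e) != len(row_a) for row_e, row_a in zip(expected, actual)
    ):
        raise ValueError("images must have the same dimensions")

    diff = []
    count = 0
    for row_e, row_a in zip(expected, actual):
        out_row = []
        for pixel_e, pixel_a in zip(row_e, row_a):
            if pixel_e == pixel_a:
                out_row.append(pixel_e)
            else:
                out_row.append(RED)
                count += 1
        diff.append(out_row)
    return diff, count


if __name__ == "__main__":
    white, black = (255, 255, 255), (0, 0, 0)
    golden = [[white, white], [black, black]]
    rendered = [[white, black], [black, black]]
    _, mismatches = diff_image(golden, rendered)
    print("differing pixels:", mismatches)
```

A test would then fail whenever the count is non-zero and save the diff image for inspection.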

Video maker: ffmpeg is awesome

From the beginning I knew I would end up using ffmpeg to create my video. Why? Well, because it is as simple as running this command:

$ ffmpeg -i image.png -i audio.mp3 video.mp4

Easy, right? Well, it is once ffmpeg has been installed and correctly configured, which is actually not that simple and would make this tool hard to install on any machine.

That's why the whole tool runs on Docker. Docker is a pretty widespread technology, and thanks to a Makefile I'm able to provide a tool that can be run like this:

$ make run

Conclusion

It took me a couple of days to write the tool and get it to a point where I could open source it, but it was totally worth it. I know that others will be able to easily reuse it, or even extend it. Who knows, maybe this should be exposed as a web application so anyone can use it, no Docker or Makefile needed!

I am currently using this tool weekly to upload the Google Cloud Platform Podcast episodes to this playlist, and you can find the whole code on this GitHub repository.

Any questions? I'm @francesc on Twitter.

By Francesc Campoy, Developer Advocate

Categories: Open Source

Using TensorFlow and JupyterHub in Classrooms

Mon, 10/31/2016 - 18:00
We’ve published a new solution and a companion GitHub repository that guide you through setting up a Google Container Engine cluster to run JupyterHub, which automatically provisions secure Jupyter containers for each user in a classroom or team. Don’t let the title of this article mislead you: not only does the solution use TensorFlow and JupyterHub, it’s actually an open source and cloud smorgasbord based on the Jupyter and Kubernetes platforms.



Jupyter is a powerful open source technology that gives you a platform to write and execute code to analyze, visualize and share the discoveries you find in your big data set. You can download a number of different Docker images preconfigured with many different notebook extensions and software packages to help you on any kind of data-science quest.

If you’re exploring on your own, and really want to get started quickly, you can get this all running on your local computer, but what if you want to take your expertise and lead a classroom of people along the same path? You have to either configure everything for them or walk them through configuring their own machines with all the required software.

This is where JupyterHub comes in, as a management layer in front of Jupyter instances, allowing you to configure users, using custom authentication, and giving you a Python interface to spawn new Jupyter instances for each user. Even with JupyterHub, you still need a way to provision physical and virtual hardware for the students.

Enter Kubernetes, an open source system for automating deploying, scaling and managing containerized applications. Google Container Engine is a fully managed service based on Kubernetes, allowing you to create clusters easily on Google Cloud Platform.

This solution comes with a JupyterHub Spawner class that allows it to create a Kubernetes Pod, running a Docker container with Jupyter, for each user. It also comes with all the automation scripts required to create a Container Engine cluster and let you easily customize your setup.
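To give a feel for what spawning a Pod per user involves, here is a minimal sketch that only builds the Pod manifest as a Python dictionary. It is not the Spawner class shipped with the solution: the function name, the `jupyter-` naming scheme, the labels and the container name are assumptions for illustration, and port 8888 is simply Jupyter's usual default.

```python
import re


def pod_manifest_for_user(username, image, port=8888):
    """Build a minimal Kubernetes Pod manifest for one user's notebook server.

    The naming scheme and labels are hypothetical; a real spawner would also
    set resources, volumes and authentication details.
    """
    # Pod names must be lowercase alphanumerics and dashes, so sanitize the username.
    safe = re.sub(r"[^a-z0-9-]", "-", username.lower()).strip("-")
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {
            "name": f"jupyter-{safe}",
            "labels": {"app": "jupyter", "user": safe},
        },
        "spec": {
            "containers": [
                {
                    "name": "notebook",
                    "image": image,
                    "ports": [{"containerPort": port}],
                }
            ],
        },
    }


if __name__ == "__main__":
    manifest = pod_manifest_for_user("student01", "jupyter/tensorflow-notebook")
    print(manifest["metadata"]["name"], manifest["spec"]["containers"][0]["image"])
```

A spawner would submit a manifest like this to the Kubernetes API when a user logs in and delete the Pod when the session ends.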

When your students log into JupyterHub using Google OAuth2, they can choose from a list of several pre-built Jupyter images, including a newly updated “datalab-jupyter” image. It comes with the Google Datalab open source notebook extension, which enables integration with BigQuery, Google Cloud ML and StackDriver, and it also has TensorFlow and the Apache Beam Python SDK for Google Cloud DataFlow installed. Users can also choose to run any of the pre-configured Jupyter docker-stack images, or you can build your own Docker images to run any special libraries or Jupyter configurations you want.

We hope that this solution allows you to get your classroom or team environment running quickly so you can focus on learning rather than configuring machines.

By Brad Svee, Cloud Solutions Architect
Categories: Open Source

Dart in 2017 and beyond

Wed, 10/26/2016 - 10:00
We’re here at the Dart Developer Summit in Munich, Germany. Over 250 developers from more than 50 companies from all over the world just finished watching the keynote.

This is a summary of the topics we covered:

Dart is the fastest growing programming language at Google, with a 3.5x increase in lines of code since last year. We like to think that this is because of our focus on developer productivity: teams report a 25% to 100% increase in speed of development. Google has bet its biggest business on Dart: the web apps built on Dart bring in over $70B per year.

Google AdSense recently launched a ground-up redesign of their web app, built with Dart. Earlier this year, we announced that the next generation of AdWords is built with Dart. There are more exciting Dart products at Google that we’re looking forward to revealing. Outside Google, companies such as Wrike, Workiva, Soundtrap, Blossom, DG Logic and Sonar Design have all been using and enjoying Dart for years.

Our five-year investment in this language is bearing fruit. But we’re not finished.

We learned that people who use Dart love its terse and readable syntax. So we’re keeping that.

We have also learned that Dart developers really enjoy the language’s powerful static analysis. So we’re making it better. With strong mode, Dart’s type system becomes sound (meaning that it rejects all incorrect programs). We’re also introducing support for generic methods.

We have validated that the programming language itself is just a part of the puzzle. Dart comes with ‘batteries included.’ Developers really like Dart’s core libraries — we will keep them tight, efficient and comprehensive. We will also continue to invest in tooling such as pub (our integrated packaging system), dartfmt (our automatic formatter) and, of course, the analyzer.

On the web, we have arrived at a framework that is an excellent fit for Dart: AngularDart. All the Google web apps mentioned above use it. It has been in production at Google since February. AngularDart is designed for Dart, and it’s getting better every week. In the past 4 months, AngularDart’s output has gotten 40% smaller, and our AngularDart web apps got 15% faster.

Today, we’re launching AngularDart 2.0 final. Tune in to the next session.

With that, we’re also releasing — as a developer preview — the AngularDart components that Google uses for its major web apps. These Material Design widgets are being developed by hundreds of Google engineers and are thoroughly tested. They are written purely in Dart.

We’re also making Dart easier to use with existing JavaScript libraries. For example, you will be able to use our tool to convert TypeScript .d.ts declarations into Dart libraries.

We’re making the development cycle much faster. Thanks to the Dart Dev Compiler, compilation to JavaScript will take less than a second across all modern browsers.

We believe all this makes Dart an even better choice for web development than before. Dart has been here for a long time and it’s not going anywhere. It’s cohesive and dependable, which is what a lot of web developers want.

We’re also very excited about Flutter — a project to help developers build high-performance, high-fidelity, mobile apps for iOS and Android from a single codebase in Dart. More on that tomorrow.

We hope you’ll enjoy these coming two days. Tune in on the live stream or follow #dartsummit on Twitter.

By Filip Hracek, Developer Relations Program Manager
Categories: Open Source

Google Summer of Code 2016 wrap-up: GNU Radio

Tue, 10/25/2016 - 18:00
This post is the third installment in our series of wrap-up posts reflecting on Google Summer of Code 2016. Check out the first and second posts in the series.

Originally posted on GNU Radio Blog

The summer has come to an end -- along with the Summer of Code for GNU Radio. It was a great season in terms of student participation, and as the students are preparing their last commits, this seems a good time to summarize their efforts.

All students presented their work (either in person, or via poster) at this year’s GNU Radio Conference in Boulder, Colorado.

gr-inspector

With gr-inspector, GNU Radio now has its own out-of-tree module, which serves as a repository for signal analysis algorithms, but also as a collection of fantastic examples. This module was created and worked on by Sebastian Müller, who was funded by Google Summer of Code (GSoC), and Christopher Richardson, who participated as a Summer of Code in Space (SOCIS) student funded by the European Space Agency. Sebastian also created a video demonstrating some of the features:


Both Sebastian and Chris have written up their efforts on their own blogs.

PyBOMBS GUI

Ravi Sharan was our other GSoC student, primarily working on a GUI for PyBOMBS, our installation helper tool. Ravi also worked on a bunch of other things, and has summarized his efforts as well.

The PyBOMBS GUI is written in Qt, and is a nice extension to our out-of-tree module ecosystem:


While some developers prefer the comfort of their command line environments, we hope that the PyBOMBS GUI will ease the entry for more new developers. The GUI ties in nicely with CGRAN, and with the correct setup, users can directly launch installation of out-of-tree modules from their browser.

Want to participate? Have ideas?

We will definitely apply for GSoC and SOCIS again next year! If you want to participate as a student, it helps a lot to get involved with the community early on. We also recommend you sign up for the mailing list, and get involved with GNU Radio by using it, reporting and fixing issues, or even publishing your own out-of-tree module. For more ideas, take a look at our summer of code wiki pages.

If you simply have ideas for future projects, those are welcome too! Suggest those on the mailing list, or simply edit the wiki page.

By Martin Braun, Organization Administrator for GNU Radio
Categories: Open Source

Google Summer of Code 2016 wrap-up: NRNB

Tue, 10/25/2016 - 17:37
This post is part of our series of Google Summer of Code wrap-ups, guest posts from students, mentors and organization admins reflecting on Google Summer of Code 2016. Don't miss our first post and follow along for more wrap-up posts and announcements.

We were so excited to be a part of Google Summer of Code (GSoC) again after a year off that we pulled together over 50 project ideas and dozens of eager mentors to develop open source code for network biology research. Organized as the National Resource for Network Biology (NRNB), we selected 15 proposals that brought together well-matched students, mentors and project ideas.

All 15 students passed their midterm and final evaluations, resulting in a wide range of (mostly) production-ready code, covering algorithm, UI, importer and converter development for both web and desktop for Cytoscape, cytoscape.js, SBML, SBGN, cBioPortal, Cell Designer, GraphSpace and more.

We are proud of the technical accomplishments and productivity of our students, and we are also proud of the many important aspects of diversity our students represent in the GSoC program, including geographical, gender and academic. Here are some numbers and facts about our 15 students compared to overall GSoC 2016 student stats in parentheses:
  • 9 different countries, including 1 (of 2) from Croatia, 1 (of 3) from Armenia and 2 (of 12) from Turkey
  • 20% female (compared to 12% overall)
  • 67% Computer Science (compared to 78% overall), including PhD students in Biological Oceanography and Medical Biochemistry & Biotechnology, an MS student in Bioinformatics, and a pre-med undergraduate.



Here are some quotes and blogs from our students this year. If you are considering applying as a student (or mentor) next year, here is some inspiration:
“I had the opportunity to learn and practice JavaScript with a very interesting project and having a mentor available was great for getting help when needed. The program seemed extremely well run and I would strongly recommend it to anyone interested.”
“Working in an NRNB [GSoC] training program helped to strengthen my resume and introduced me to the idea of combining a career in medicine with computer-based research.”
“I love the friendly atmosphere and the way the team works together. From the very beginning I [felt] well integrated in the group. It was pure fun to work together on the same project and to see how it [has] grown over the time. I [would] recommend everybody try the NRNB training program.”
Some of our student blogs:
  • Hovakim Grabski – "Java support for Deviser, a code generation system for SBML libraries"
  • Kaito Ii – "Interconvertible Layout software for CellDesigner" 
  • Roman Schulte – "Offline SBML validation in the Java-based JSBML library"
  • Mridul Seth – "Import graphs in multiple formats and Cytoscape files into GraphSpace"

By Alex Pico and Kristina Hanspers, Organization Administrators for NRNB
Categories: Open Source