
Building delightful Android camera and media experiences


Posted by Donovan McMurray, Mayuri Khinvasara Khabya, Mozart Louis, and Nevin Mital – Developer Relations Engineers

Hello Android Developers!

We are the Android Developer Relations Camera & Media team, and we’re excited to bring you something a little different today. Over the past several months, we’ve been hard at work writing sample code and building demos that showcase how to take advantage of all the great potential Android offers for building delightful user experiences.

Some of these efforts are available for you to explore now, and some you’ll see later throughout the year, but for this blog post we thought we’d share some of the learnings we gathered while going through this exercise.

Grab your favorite Android plush or rubber duck, and read on to see what we’ve been up to!

Future-proof your app with Jetpack

Nevin Mital

One of our focuses for the past several years has been improving the developer tools available for video editing on Android. This led to the creation of the Jetpack Media3 Transformer APIs, which offer solutions for both single-asset and multi-asset video editing preview and export. Today, I’d like to focus on the Composition demo app, a sample app that showcases some of the multi-asset editing experiences that Transformer enables.

I started by adding a custom video compositor to demonstrate how you can arrange input video sequences into different layouts for your final composition, such as a 2×2 grid or a picture-in-picture overlay. You can customize this by implementing a VideoCompositorSettings and overriding the getOverlaySettings method. This object can then be set when building your Composition with setVideoCompositorSettings.

Here is an example for the 2×2 grid layout:

object : VideoCompositorSettings {

    override fun getOverlaySettings(inputId: Int, presentationTimeUs: Long): OverlaySettings {
        return when (inputId) {
            0 -> { // First sequence is placed in the top left
                StaticOverlaySettings.Builder()
                    .setScale(0.5f, 0.5f)
                    .setOverlayFrameAnchor(0f, 0f) // Middle of overlay
                    .setBackgroundFrameAnchor(-0.5f, 0.5f) // Top-left section of background
                    .build()
            }

            1 -> { // Second sequence is placed in the top right
                StaticOverlaySettings.Builder()
                    .setScale(0.5f, 0.5f)
                    .setOverlayFrameAnchor(0f, 0f) // Middle of overlay
                    .setBackgroundFrameAnchor(0.5f, 0.5f) // Top-right section of background
                    .build()
            }

            2 -> { // Third sequence is placed in the bottom left
                StaticOverlaySettings.Builder()
                    .setScale(0.5f, 0.5f)
                    .setOverlayFrameAnchor(0f, 0f) // Middle of overlay
                    .setBackgroundFrameAnchor(-0.5f, -0.5f) // Bottom-left section of background
                    .build()
            }

            3 -> { // Fourth sequence is placed in the bottom right
                StaticOverlaySettings.Builder()
                    .setScale(0.5f, 0.5f)
                    .setOverlayFrameAnchor(0f, 0f) // Middle of overlay
                    .setBackgroundFrameAnchor(0.5f, -0.5f) // Bottom-right section of background
                    .build()
            }

            else -> {
                StaticOverlaySettings.Builder().build()
            }
        }
    }
}

Since getOverlaySettings also provides a presentation time, we can even animate the layout, such as in this picture-in-picture example:

moving image of picture in picture on a mobile device
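To illustrate, here is a rough sketch (not taken from the demo app) of how getOverlaySettings could use presentationTimeUs to slide a picture-in-picture overlay along the bottom of the frame during the first three seconds; the scale and timing values are arbitrary.

object : VideoCompositorSettings {
    override fun getOverlaySettings(inputId: Int, presentationTimeUs: Long): OverlaySettings {
        return when (inputId) {
            1 -> { // Second sequence is the picture-in-picture overlay
                // Animate from the bottom-left corner to the bottom-right corner
                // over the first three seconds, then hold the final position.
                val progress = (presentationTimeUs / 3_000_000f).coerceIn(0f, 1f)
                StaticOverlaySettings.Builder()
                    .setScale(0.3f, 0.3f)
                    .setOverlayFrameAnchor(0f, 0f) // Middle of overlay
                    .setBackgroundFrameAnchor(-0.6f + 1.2f * progress, -0.6f) // Animated position
                    .build()
            }

            else -> { // First sequence fills the background
                StaticOverlaySettings.Builder().build()
            }
        }
    }
}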

Next, I spent some time migrating the Composition demo app to use Jetpack Compose. With complicated editing flows, it can help to take advantage of as much screen space as is available, so I decided to use the supporting pane adaptive layout. This way, the user can fine-tune their video creation on the preview screen, and the export options are shown alongside it only when the display is large enough. Below, you can see how the UI dynamically adapts to the screen size on a foldable device, when switching from the outer screen to the inner screen and vice versa.


moving image of supporting pane adaptive layout
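The demo's Compose code isn't reproduced here, but a simplified outline of the supporting pane approach, using the Compose Material 3 adaptive library, might look like the sketch below. EditingPreview and ExportOptions are hypothetical placeholder composables, and parameter names may differ slightly between library versions.

@Composable
fun CompositionEditorScreen() {
    // The navigator computes which panes fit in the current window size.
    val navigator = rememberSupportingPaneScaffoldNavigator()

    SupportingPaneScaffold(
        directive = navigator.scaffoldDirective,
        value = navigator.scaffoldValue,
        mainPane = {
            AnimatedPane {
                EditingPreview() // hypothetical: video preview and editing controls
            }
        },
        supportingPane = {
            AnimatedPane {
                ExportOptions() // hypothetical: export controls shown when space allows
            }
        },
    )
}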

What’s great is that by using Jetpack Media3 and Jetpack Compose, these features also carry over seamlessly to other devices and form factors, such as the new Android XR platform. Right out-of-the-box, I was able to run the demo app in Home Space with the 2D UI I already had. And with some small updates, I was even able to adapt the UI specifically for XR with features such as multiple panels, and to take further advantage of the extra space, an Orbiter with playback controls for the editing preview.

moving image of sequential composition preview in Android XR

Orbiter(
    position = OrbiterEdge.Bottom,
    offset = EdgeOffset.inner(offset = MaterialTheme.spacing.standard),
    alignment = Alignment.CenterHorizontally,
    shape = SpatialRoundedCornerShape(CornerSize(28.dp))
) {
    Row(horizontalArrangement = Arrangement.spacedBy(MaterialTheme.spacing.mini)) {
        // Playback control for rewinding by 10 seconds
        FilledTonalIconButton({ viewModel.seekBack(10_000L) }) {
            Icon(
                painter = painterResource(id = R.drawable.rewind_10),
                contentDescription = "Rewind by 10 seconds"
            )
        }
        // Playback control for play/pause
        FilledTonalIconButton({ viewModel.togglePlay() }) {
            Icon(
                painter = painterResource(id = R.drawable.rounded_play_pause_24),
                contentDescription =
                    if (viewModel.compositionPlayer.isPlaying) {
                        "Pause preview playback"
                    } else {
                        "Resume preview playback"
                    }
            )
        }
        // Playback control for forwarding by 10 seconds
        FilledTonalIconButton({ viewModel.seekForward(10_000L) }) {
            Icon(
                painter = painterResource(id = R.drawable.forward_10),
                contentDescription = "Forward by 10 seconds"
            )
        }
    }
}

Jetpack libraries unlock premium functionality incrementally

Donovan McMurray

Not only do our Jetpack libraries have you covered by working consistently across existing and future devices, but they also open the doors to advanced functionality and custom behaviors to support all types of app experiences. In a nutshell, our Jetpack libraries aim to make the common case very accessible and easy, while providing hooks for adding more custom features later.

We’ve worked with many apps that have switched to a Jetpack library, built the basics, added their critical custom features, and actually saved developer time over their estimates. Let’s take a look at CameraX and how this incremental development can supercharge your process.

// Set up CameraX app with preview and image capture.
// Note: setting the resolution selector is optional, and if not set,
// then a default 4:3 ratio will be used.
val aspectRatioStrategy = AspectRatioStrategy(
    AspectRatio.RATIO_16_9, AspectRatioStrategy.FALLBACK_RULE_NONE)
val resolutionSelector = ResolutionSelector.Builder()
    .setAspectRatioStrategy(aspectRatioStrategy)
    .build()

private val previewUseCase = Preview.Builder()
    .setResolutionSelector(resolutionSelector)
    .build()
private val imageCaptureUseCase = ImageCapture.Builder()
    .setResolutionSelector(resolutionSelector)
    .setCaptureMode(ImageCapture.CAPTURE_MODE_MINIMIZE_LATENCY)
    .build()

val useCaseGroupBuilder = UseCaseGroup.Builder()
    .addUseCase(previewUseCase)
    .addUseCase(imageCaptureUseCase)

cameraProvider.unbindAll()

camera = cameraProvider.bindToLifecycle(
    this, // lifecycleOwner
    CameraSelector.DEFAULT_BACK_CAMERA,
    useCaseGroupBuilder.build(),
)

After setting up the basic structure for CameraX, you can set up a simple UI with a camera preview and a shutter button. You can use the CameraX Viewfinder composable, which displays a Preview stream from a CameraX SurfaceRequest.

// Create preview
Box(
    Modifier
        .background(Color.Black)
        .fillMaxSize(),
    contentAlignment = Alignment.Center,
) {
    surfaceRequest?.let { request ->
        CameraXViewfinder(
            modifier = Modifier.fillMaxSize(),
            implementationMode = ImplementationMode.EXTERNAL,
            surfaceRequest = request,
        )
    }
    Button(
        onClick = onPhotoCapture,
        shape = CircleShape,
        colors = ButtonDefaults.buttonColors(containerColor = Color.White),
        modifier = Modifier
            .height(75.dp)
            .width(75.dp),
    ) {} // Label-free shutter button
}

fun onPhotoCapture() {
    // Not shown: defining the ImageCapture.OutputFileOptions for
    // your saved images
    imageCaptureUseCase.takePicture(
        outputOptions,
        ContextCompat.getMainExecutor(context),
        object : ImageCapture.OnImageSavedCallback {
            override fun onError(exc: ImageCaptureException) {
                val msg = "Photo capture failed."
                Toast.makeText(context, msg, Toast.LENGTH_SHORT).show()
            }

            override fun onImageSaved(output: ImageCapture.OutputFileResults) {
                val savedUri = output.savedUri
                if (savedUri != null) {
                    // Do something with the savedUri if needed
                } else {
                    val msg = "Photo capture failed."
                    Toast.makeText(context, msg, Toast.LENGTH_SHORT).show()
                }
            }
        },
    )
}
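One piece not shown above is where surfaceRequest comes from. A minimal sketch, assuming the state lives in a ViewModel that exposes a StateFlow, is to give the Preview use case a surface provider that publishes each new SurfaceRequest for the composable to collect:

// In your ViewModel (or other state holder):
private val _surfaceRequest = MutableStateFlow<SurfaceRequest?>(null)
val surfaceRequest: StateFlow<SurfaceRequest?> = _surfaceRequest.asStateFlow()

init {
    // CameraX invokes this whenever the preview needs a new surface; publishing
    // the request lets the CameraXViewfinder composable render the stream.
    previewUseCase.setSurfaceProvider { newSurfaceRequest ->
        _surfaceRequest.value = newSurfaceRequest
    }
}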

You’re already on track for a solid camera experience, but what if you wanted to add some extra features for your users? Adding filters and effects is easy with CameraX’s Media3 effect integration, which is one of the new features introduced in CameraX 1.4.0.

Here’s how simple it is to add a black and white filter from Media3’s built-in effects.

val media3Effect = Media3Effect(
    application,
    PREVIEW or IMAGE_CAPTURE,
    ContextCompat.getMainExecutor(application),
    {},
)
media3Effect.setEffects(listOf(RgbFilter.createGrayscaleFilter()))
useCaseGroupBuilder.addEffect(media3Effect)

The Media3Effect object takes a Context, a bitwise representation of the use case constants for targeted UseCases, an Executor, and an error listener. Then you set the list of effects you want to apply. Finally, you add the effect to the useCaseGroupBuilder we defined earlier.

(Left) Our camera app with no filter applied. (Right) Our camera app after the createGrayscaleFilter was added.

There are many other built-in effects you can add, too! See the Media3 Effect documentation for more options, like brightness, color lookup tables (LUTs), contrast, blur, and many other effects.
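For example, a short sketch of stacking a couple of those built-in effects, assuming Media3's Contrast and Brightness effect classes (the values here are arbitrary), looks like this:

media3Effect.setEffects(
    listOf(
        Contrast(0.3f),   // Boost contrast slightly
        Brightness(0.1f), // Brighten the image slightly
    )
)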

To take your effects to yet another level, it’s also possible to define your own effects by implementing the GlEffect interface, which acts as a factory of GlShaderPrograms. You can implement a BaseGlShaderProgram’s drawFrame() method to implement a custom effect of your own. A minimal implementation should tell your graphics library to use its shader program, bind the shader program’s vertex attributes and uniforms, and issue a drawing command.
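To make that structure concrete, here is a rough sketch of a custom warm-tint effect. It leans on Media3's GlProgram and GlUtil helpers and uses placeholder GLSL strings, so treat it as an illustration of the shape of the code rather than a drop-in implementation; helper APIs may differ between Media3 versions.

// Sketch only: a warm-tint effect built on Media3's BaseGlShaderProgram.
class TintEffect : GlEffect {
    override fun toGlShaderProgram(context: Context, useHdr: Boolean): GlShaderProgram =
        TintShaderProgram()
}

class TintShaderProgram : BaseGlShaderProgram(
    /* useHighPrecisionColorComponents= */ false, /* texturePoolCapacity= */ 1
) {
    // Compile placeholder shaders that draw a full-screen quad and tint the colors.
    private val glProgram = GlProgram(VERTEX_SHADER, FRAGMENT_SHADER).apply {
        setBufferAttribute(
            "aFramePosition",
            GlUtil.getNormalizedCoordinateBounds(),
            GlUtil.HOMOGENEOUS_COORDINATE_VECTOR_SIZE,
        )
    }

    override fun configure(inputWidth: Int, inputHeight: Int): Size = Size(inputWidth, inputHeight)

    override fun drawFrame(inputTexId: Int, presentationTimeUs: Long) {
        // Use the shader program, bind its vertex attributes and uniforms,
        // then issue the drawing command.
        glProgram.use()
        glProgram.setSamplerTexIdUniform("uTexSampler", inputTexId, /* texUnitIndex= */ 0)
        glProgram.bindAttributesAndUniforms()
        GLES20.glDrawArrays(GLES20.GL_TRIANGLE_STRIP, /* first= */ 0, /* count= */ 4)
    }

    companion object {
        private const val VERTEX_SHADER = """
            attribute vec4 aFramePosition;
            varying vec2 vTexCoords;
            void main() {
              gl_Position = aFramePosition;
              vTexCoords = aFramePosition.xy * 0.5 + 0.5;
            }"""

        private const val FRAGMENT_SHADER = """
            precision mediump float;
            uniform sampler2D uTexSampler;
            varying vec2 vTexCoords;
            void main() {
              vec4 color = texture2D(uTexSampler, vTexCoords);
              gl_FragColor = vec4(color.rgb * vec3(1.0, 0.9, 0.8), color.a);
            }"""
    }
}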

Jetpack libraries meet you where you are and your app’s needs. Whether that be a simple, fast-to-implement, and reliable implementation, or custom functionality that helps the critical user journeys in your app stand out from the rest, Jetpack has you covered!

Jetpack offers a foundation for innovative AI Features

Mayuri Khinvasara Khabya

Just as Donovan demonstrated with CameraX for capture, Jetpack Media3 provides a reliable, customizable, and feature-rich solution for playback with ExoPlayer. The AI Samples app builds on this foundation to delight users with helpful and enriching AI-driven additions.

In today’s rapidly evolving digital landscape, users expect more from their media applications. Simply playing videos is no longer enough. Developers are constantly seeking ways to enhance user experiences and provide deeper engagement. Leveraging the power of Artificial Intelligence (AI), particularly when built upon robust media frameworks like Media3, offers exciting opportunities. Let’s take a look at some of the ways we can transform the way users interact with video content:

Empowering Video Understanding: The core idea is to use AI, specifically multimodal models like the Gemini Flash and Pro models, to analyze video content and extract meaningful information. This goes beyond simply playing a video; it’s about understanding what’s in the video and making that information readily accessible to the user.
Actionable Insights: The goal is to transform raw video into summaries, insights, and interactive experiences. This allows users to quickly grasp the content of a video and find specific information they need or learn something new!
Accessibility and Engagement: AI helps make videos more accessible by providing features like summaries, translations, and descriptions. It also aims to increase user engagement through interactive features.

A Glimpse into AI-Powered Video Journeys

The following example demonstrates potential video journeys enhanced by artificial intelligence. This sample integrates several components, such as ExoPlayer and Transformer from Media3; the Firebase SDK (leveraging Vertex AI on Android); and Jetpack Compose, ViewModel, and StateFlow. The code will be available soon on GitHub.

moving images of examples of AI-powered video journeys
(Left) Video summarization. (Right) Thumbnail timestamps and HDR frame extraction.

There are two experiences in particular that I’d like to highlight:

HDR Thumbnails: AI can help identify key moments in the video that could make for good thumbnails. With those timestamps, you can use the new ExperimentalFrameExtractor API from Media3 to extract HDR thumbnails from videos, providing richer visual previews.
Text-to-Speech: AI can be used to convert textual information derived from the video into spoken audio, enhancing accessibility. On Android, you can also choose to play audio in different languages and dialects, enhancing personalization for a wider audience; a minimal sketch follows below.
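As a small illustration of that last point, here is a minimal sketch (not taken from the sample app) that speaks an AI-generated summary aloud with the platform TextToSpeech API in a chosen locale:

class SummarySpeaker(context: Context) : TextToSpeech.OnInitListener {

    private val tts = TextToSpeech(context, this)
    private var isReady = false

    override fun onInit(status: Int) {
        if (status == TextToSpeech.SUCCESS) {
            // Pick any locale the engine supports, e.g. a user-selected language.
            tts.language = Locale.forLanguageTag("en-US")
            isReady = true
        }
    }

    fun speak(summary: String) {
        if (isReady) {
            tts.speak(summary, TextToSpeech.QUEUE_FLUSH, /* params= */ null, /* utteranceId= */ "summary")
        }
    }

    fun release() {
        tts.shutdown()
    }
}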

Using the right AI solution

Currently, only cloud models support video inputs, so we went ahead with a cloud-based solution. Integrating Firebase in our sample empowers the app to:

Generate real-time, concise video summaries automatically.
Produce comprehensive content metadata, including chapter markers and relevant hashtags.
Facilitate seamless multilingual content translation.

So how do you actually interact with a video and work with Gemini to process it? First, send your video as an input parameter to your prompt:

val prompt =
    "Summarize this video in the form of top 3-4 takeaways only. Write in the form of bullet points. Don't assume if you don't know"

val generativeModel = Firebase.vertexAI.generativeModel("gemini-2.0-flash")
_outputText.value = OutputTextState.Loading

viewModelScope.launch(Dispatchers.IO) {
    try {
        val requestContent = content {
            fileData(videoSource.toString(), "video/mp4")
            text(prompt)
        }
        val outputStringBuilder = StringBuilder()

        generativeModel.generateContentStream(requestContent).collect { response ->
            outputStringBuilder.append(response.text)
            _outputText.value = OutputTextState.Success(outputStringBuilder.toString())
        }

        _outputText.value = OutputTextState.Success(outputStringBuilder.toString())

    } catch (error: Exception) {
        _outputText.value = error.localizedMessage?.let { OutputTextState.Error(it) }
    }
}

Notice there are two key components here:

FileData: This component integrates a video into the query.
Prompt: This specifies what assistance you want from the AI in relation to the provided video.

Of course, you can fine-tune your prompt to your requirements and get responses accordingly.

In conclusion, by harnessing the capabilities of Jetpack Media3 and integrating AI solutions like Gemini through Firebase, you can significantly elevate video experiences on Android. This combination enables advanced features like video summaries, enriched metadata, and seamless multilingual translations, ultimately enhancing accessibility and engagement for users. As these technologies continue to evolve, the potential for creating even more dynamic and intelligent video applications is vast.

Go above-and-beyond with specialized APIs

Mozart Louis

Android 16 introduces the new audio PCM Offload mode, which can reduce the power consumption of audio playback in your app, leading to longer playback time and increased user engagement. Eliminating power anxiety greatly enhances the user experience.

Oboe is Android’s premier native audio API for creating high-performance, low-latency audio apps. A new feature called Native PCM Offload playback is being added to the Android NDK and Android 16.

Offload playback helps save battery life when playing audio. It works by sending a large chunk of audio to a special part of the device’s hardware (a DSP). This allows the CPU of the device to go into a low-power state while the DSP handles playing the sound. This works with uncompressed audio (like PCM) and compressed audio (like MP3 or AAC), where the DSP also takes care of decoding.

This can result in significant power saving while playing back audio and is perfect for applications that play audio in the background or while the screen is off (think audiobooks, podcasts, music etc).

We created the sample app PowerPlay to demonstrate how to implement these features using the latest NDK version, C++ and Jetpack Compose.

Here are the most important parts!

The first order of business is to ensure the device supports audio offload for the stream attributes you need. In the example below, we check whether the device supports audio offload of a stereo, float PCM stream with a sample rate of 48,000 Hz.

val format = AudioFormat.Builder()
    .setEncoding(AudioFormat.ENCODING_PCM_FLOAT)
    .setSampleRate(48000)
    .setChannelMask(AudioFormat.CHANNEL_OUT_STEREO)
    .build()

val attributes =
    AudioAttributes.Builder()
        .setContentType(AudioAttributes.CONTENT_TYPE_MUSIC)
        .setUsage(AudioAttributes.USAGE_MEDIA)
        .build()

val isOffloadSupported =
    if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.Q) {
        AudioManager.isOffloadedPlaybackSupported(format, attributes)
    } else {
        false
    }

if (isOffloadSupported) {
    player.initializeAudio(PerformanceMode::POWER_SAVING_OFFLOADED)
}

Once we know the device supports audio offload, we can confidently set the Oboe audio streams’ performance mode to the new performance mode option, PerformanceMode::POWER_SAVING_OFFLOADED.

// Create an audio stream
AudioStreamBuilder builder;
builder.setChannelCount(mChannelCount);
builder.setDataCallback(mDataCallback);
builder.setFormat(AudioFormat::Float);
builder.setSampleRate(48000);

builder.setErrorCallback(mErrorCallback);
builder.setPresentationCallback(mPresentationCallback);
builder.setPerformanceMode(PerformanceMode::POWER_SAVING_OFFLOADED);
builder.setFramesPerDataCallback(128);
builder.setSharingMode(SharingMode::Exclusive);
builder.setSampleRateConversionQuality(SampleRateConversionQuality::Medium);
Result result = builder.openStream(mAudioStream);

Now when audio is played back, it will be offloading audio to the DSP, helping save power when playing back audio.
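The builder above references a data callback (mDataCallback) that supplies the PCM frames to play. A minimal sketch of one, following the standard Oboe AudioStreamDataCallback contract, might look like this; the audio source is a stand-in for however your app decodes or generates samples:

class MyDataCallback : public oboe::AudioStreamDataCallback {
public:
    oboe::DataCallbackResult onAudioReady(
            oboe::AudioStream *audioStream, void *audioData, int32_t numFrames) override {
        // Fill the buffer with float PCM frames from your own audio source.
        auto *output = static_cast<float *>(audioData);
        const int32_t numSamples = numFrames * audioStream->getChannelCount();
        mSource->renderAudio(output, numSamples); // hypothetical decoder/generator
        return oboe::DataCallbackResult::Continue;
    }

private:
    std::shared_ptr<MyAudioSource> mSource; // hypothetical audio source
};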

There is more to this feature that will be covered in a future blog post, fully detailing out all of the new available APIs that will help you optimize your audio playback experience!

What’s next

Of course, we were only able to share the tip of the iceberg with you here, so to dive deeper into the samples, check out the following links:

Hopefully these examples have inspired you to explore what new and fascinating experiences you can build on Android. Tune in to our session at Google I/O in a couple weeks to learn even more about use-cases supported by solutions like Jetpack CameraX and Jetpack Media3!


