2-2-30 Kotlin Quick Start: Combining map and filter

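Combining map and filter in Kotlin

map and filter are two of the most commonly used collection functions in Kotlin. Chaining them builds concise, expressive data-processing pipelines.

1. Basic combinations

Basic patterns

// Filter first, then map (usually more efficient, because map runs on fewer elements)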
val numbers = listOf(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)

val result = numbers
    .filter { it % 2 == 0 }      // keep the even numbers first: [2, 4, 6, 8, 10]
    .map { it * it }             // then square them: [4, 16, 36, 64, 100]

// Map first, then filter (needed when the predicate depends on the transformed value)
val anotherResult = numbers
    .map { it * 2 }              // double first: [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]
    .filter { it > 10 }          // then keep values greater than 10: [12, 14, 16, 18, 20]
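
To make the cost difference between the two orderings concrete, the sketch below counts how many times the map lambda runs in each case. The helper name countMapCalls is hypothetical and exists only for this illustration; squaring is used because it keeps parity, so both orderings return the same list.

```kotlin
// Hypothetical helper: returns (map calls when filtering first, map calls when mapping first).
fun countMapCalls(numbers: List<Int>): Pair<Int, Int> {
    var filterFirst = 0
    val a = numbers
        .filter { it % 2 == 0 }
        .map { filterFirst++; it * it }

    var mapFirst = 0
    val b = numbers
        .map { mapFirst++; it * it }
        .filter { it % 2 == 0 }   // a square is even exactly when its root is even

    check(a == b)                 // both orderings produce the same list here
    return filterFirst to mapFirst
}
```

For the ten-element list above, filtering first runs the transform 5 times, while mapping first runs it 10 times.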

2. Combining filter variants

filterIndexed + map

val names = listOf("Alice", "Bob", "Charlie", "David", "Eve")

// Keep the elements at even indices and convert them to upper case
val result = names
    .filterIndexed { index, _ -> index % 2 == 0 }  // indices 0, 2, 4: ["Alice", "Charlie", "Eve"]
    .map { it.uppercase() }                        // ["ALICE", "CHARLIE", "EVE"]

// More complex filtering and mapping
val processed = names
    .filterIndexed { index, name ->
        index > 0 && name.length > 3  // skip the first element and keep names longer than 3 characters
    }
    .mapIndexed { index, name ->
        "${index + 1}. $name"          // renumber from 1
    }
// Result: ["1. Charlie", "2. David"] ("Bob" and "Eve" have only 3 characters)
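
Note that mapIndexed numbers the elements by their position in the filtered list. If the original positions matter, one option is withIndex(), which pairs each element with its source index before filtering. This is a minimal sketch; the function name labelWithOriginalIndex is hypothetical.

```kotlin
// Keeps each element's original position instead of renumbering after the filter.
fun labelWithOriginalIndex(names: List<String>): List<String> =
    names.withIndex()
        .filter { (index, name) -> index > 0 && name.length > 3 }
        .map { (index, name) -> "${index + 1}. $name" }
```

With the names list above this yields ["3. Charlie", "4. David"].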

filterNotNull + map

val mixedList: List<Int?> = listOf(1, null, 2, null, 3, 4, null, 5)

// Drop nulls, then map
val result = mixedList
    .filterNotNull()                    // [1, 2, 3, 4, 5]
    .map { it * 10 }                    // [10, 20, 30, 40, 50]

// More complex case: filter and transform objects with nullable fields
data class User(val id: Int, val name: String?, val email: String?)

val users = listOf(
    User(1, "Alice", "alice@example.com"),
    User(2, null, "bob@example.com"),
    User(3, "Charlie", null),
    User(4, "David", "david@example.com")
)

val validUsers = users
    .filter { it.name != null && it.email != null }
    .map { it.copy(name = it.name!!.uppercase()) }
// Result: [User(1, "ALICE", "alice@example.com"), User(4, "DAVID", "david@example.com")]
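
The !! in the map step is safe only because the preceding filter already checked for null, and the compiler cannot verify that link. A possible alternative is to do both steps in one mapNotNull and return a type whose fields are non-null. The types RawUser and ValidUser below are hypothetical stand-ins, not part of the original example.

```kotlin
// Hypothetical input and output types for illustration.
data class RawUser(val id: Int, val name: String?, val email: String?)
data class ValidUser(val id: Int, val name: String, val email: String)

// Skips users with a missing name or email; no !! is needed.
fun toValidUsers(users: List<RawUser>): List<ValidUser> =
    users.mapNotNull { user ->
        val name = user.name ?: return@mapNotNull null
        val email = user.email ?: return@mapNotNull null
        ValidUser(user.id, name.uppercase(), email)
    }
```

The result type now records that name and email are present, so later code needs no further null checks.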

filterIsInstance + map

val mixed: List<Any> = listOf(1, "hello", 2.5, "world", 3, "kotlin", 4.7)

// Keep only the strings, then map
val strings = mixed
    .filterIsInstance<String>()    // ["hello", "world", "kotlin"]
    .map { it.uppercase() }        // ["HELLO", "WORLD", "KOTLIN"]

// Keep only the numbers, then compute
val numbers = mixed
    .filterIsInstance<Int>()       // [1, 3]
    .map { it * 2 }                // [2, 6]

val doubles = mixed
    .filterIsInstance<Double>()    // [2.5, 4.7]
    .map { it.toInt() }            // [2, 4]

3. Simplifying with mapNotNull

filter + map vs mapNotNull

val strings = listOf("1", "2", "3", "four", "5", "six")

// Option 1: filter + map (two steps)
val result1 = strings
    .filter { it.toIntOrNull() != null }  // ["1", "2", "3", "5"]
    .map { it.toInt() }                   // [1, 2, 3, 5]

// Option 2: mapNotNull (one step, more concise)
val result2 = strings.mapNotNull { it.toIntOrNull() }  // [1, 2, 3, 5]

// A more complex conversion
data class ApiResponse(val data: Map<String, Any>?)

val responses = listOf(
    ApiResponse(mapOf("id" to 1, "name" to "Alice")),
    ApiResponse(null),
    ApiResponse(mapOf("id" to "2", "name" to "Bob")),  // id is a String here
    ApiResponse(mapOf("id" to 3, "name" to "Charlie"))
)

val validIds = responses.mapNotNull { response ->
    // Safely extract and convert the id
    val data = response.data ?: return@mapNotNull null
    (data["id"] as? Int)?.let { id ->
        Pair(id, data["name"] as? String ?: "Unknown")
    }
}
// Result: [(1, "Alice"), (3, "Charlie")]

4. Optimizing chained operations

Lazy evaluation with Sequence

val largeList = (1..1_000_000).toList()

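// Option 1: List (eager evaluation)
val result1 = largeList
    .filter { it % 2 == 0 }     // builds an intermediate list of 500,000 elements
    .map { it * 2 }             // builds another intermediate list of 500,000 elements
    .take(10)                   // keeps only the first 10

// Option 2: Sequence (lazy evaluation, more efficient here)
val result2 = largeList.asSequence()
    .filter { it % 2 == 0 }     // lazy filter
    .map { it * 2 }             // lazy map
    .take(10)                   // only 10 results are ever computed
    .toList()                   // convert to a List at the end

// Rough timing comparison (in milliseconds)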
fun measureTime(block: () -> Unit): Long {
    val start = System.currentTimeMillis()
    block()
    return System.currentTimeMillis() - start
}

val time1 = measureTime {
    largeList.filter { it % 2 == 0 }.map { it * 2 }.take(10)
}

val time2 = measureTime {
    largeList.asSequence().filter { it % 2 == 0 }.map { it * 2 }.take(10).toList()
}

println("List time: ${time1}ms")       // usually slower: every element is processed
println("Sequence time: ${time2}ms")   // usually faster: only the needed elements are processed
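
Wall-clock timings like these vary from run to run, so counting predicate invocations gives a more repeatable picture of why the Sequence version does less work. The following is a minimal sketch; the helper name countFilterCalls is hypothetical.

```kotlin
// Hypothetical helper: returns (predicate calls in the List chain, predicate calls in the Sequence chain).
fun countFilterCalls(source: List<Int>): Pair<Int, Int> {
    var eagerCalls = 0
    val viaList = source
        .filter { eagerCalls++; it % 2 == 0 }   // runs over the whole list
        .map { it * 2 }
        .take(10)

    var lazyCalls = 0
    val viaSequence = source.asSequence()
        .filter { lazyCalls++; it % 2 == 0 }    // runs only until take(10) is satisfied
        .map { it * 2 }
        .take(10)
        .toList()

    check(viaList == viaSequence)               // same result either way
    return eagerCalls to lazyCalls
}
```

For the input 1..1000 the List chain evaluates the predicate 1000 times, while the Sequence chain stops after 20 evaluations, because the tenth even number is 20.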

Reducing intermediate collections

// Not ideal: every step allocates a new list
val bad = listOf(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
    .filter { it % 2 == 0 }     // intermediate list: [2, 4, 6, 8, 10]
    .map { it * 3 }             // intermediate list: [6, 12, 18, 24, 30]
    .filter { it > 15 }         // intermediate list: [18, 24, 30]
    .map { it / 2 }             // final result: [9, 12, 15]

// Better: use a Sequence so no intermediate lists are built
val good1 = listOf(1, 2, 3, 4, 5, 6, 7, 8, 9, 10).asSequence()
    .filter { it % 2 == 0 }
    .map { it * 3 }
    .filter { it > 15 }
    .map { it / 2 }
    .toList()

// Alternative: merge the steps by hand
val good2 = listOf(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
    .filter { it % 2 == 0 && it * 3 > 15 }  // merged filter conditions
    .map { (it * 3) / 2 }                   // merged map operations
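
The merged version above still computes it * 3 twice per surviving element. A possible single-pass variant uses mapNotNull with a local variable, so each element is visited once and the product is computed once. This is only a sketch, and the name singlePass is hypothetical.

```kotlin
// One traversal, one result list, and the tripled value is computed once per element.
fun singlePass(numbers: List<Int>): List<Int> =
    numbers.mapNotNull { n ->
        if (n % 2 != 0) return@mapNotNull null   // first filter: even numbers only
        val tripled = n * 3                      // first map
        if (tripled > 15) tripled / 2 else null  // second filter and second map
    }
```

The trade-off is readability: the four named steps of the chain collapse into one lambda, which is usually worth it only on hot paths.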

5. Practical scenarios

Data cleansing and transformation

import java.time.LocalDate

data class RawData(
    val id: String?,
    val value: String?,
    val timestamp: String?
)

// Validated domain object produced by the pipeline
data class CleanData(
    val id: Int,
    val value: Double,
    val timestamp: LocalDate
)

val rawDataList = listOf(
    RawData("1", "100.5", "2023-01-01"),
    RawData("2", "invalid", "2023-01-02"),
    RawData(null, "200.0", "2023-01-03"),
    RawData("4", "150.75", "invalid"),
    RawData("5", "300.0", "2023-01-05")
)

// Data-cleansing pipeline
// (the list elements are non-null RawData, so no filterNotNull() is needed)
val cleanData = rawDataList
    .filter { data ->
        // Validate every field
        data.id?.toIntOrNull() != null &&
        data.value?.toDoubleOrNull() != null &&
        data.timestamp?.matches(Regex("\\d{4}-\\d{2}-\\d{2}")) == true
    }
    .map { data ->
        // Convert to the domain object; !! is safe only because of the filter above
        CleanData(
            id = data.id!!.toInt(),
            value = data.value!!.toDouble(),
            timestamp = LocalDate.parse(data.timestamp)
        )
    }
    .sortedBy { it.value }  // sort by value

println("Valid records: ${cleanData.size}")
cleanData.forEach { println(it) }
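
The regular expression in this pipeline checks only the shape of the timestamp. A string such as "2023-13-45" has the right shape, yet LocalDate.parse rejects it with a DateTimeParseException. One way to guard against that is a small parse-or-null helper combined with mapNotNull. The names parseDateOrNull and validDates are hypothetical.

```kotlin
import java.time.LocalDate
import java.time.format.DateTimeParseException

// Hypothetical helper: returns null instead of throwing for missing, malformed, or impossible dates.
fun parseDateOrNull(text: String?): LocalDate? =
    if (text == null) null
    else try {
        LocalDate.parse(text)          // expects ISO-8601, e.g. 2023-01-05
    } catch (e: DateTimeParseException) {
        null
    }

// Keeps only the entries that parse to a real calendar date.
fun validDates(raw: List<String?>): List<LocalDate> = raw.mapNotNull(::parseDateOrNull)
```

Validation and conversion then happen in the same step, so they cannot drift apart.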

Handling API responses

// Simulated API response
data class ApiResponse<T>(
    val success: Boolean,
    val data: T?,
    val error: String? = null
)

data class UserDto(
    val id: Int,
    val name: String,
    val email: String?,
    val isActive: Boolean?
)

data class User(
    val id: Int,
    val name: String,
    val email: String,
    val status: String
)

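// Several API responses to process (explicit type so that the failed response also gets T = List<UserDto>)
val apiResponses: List<ApiResponse<List<UserDto>>> = listOf(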
    ApiResponse(
        success = true,
        data = listOf(
            UserDto(1, "Alice", "alice@example.com", true),
            UserDto(2, "Bob", null, true),
            UserDto(3, "Charlie", "charlie@example.com", false)
        )
    ),
    ApiResponse(
        success = false,
        data = null,
        error = "Network error"
    ),
    ApiResponse(
        success = true,
        data = listOf(
            UserDto(4, "David", "david@example.com", true),
            UserDto(5, "Eve", "eve@example.com", null)
        )
    )
)

// Response-processing pipeline
val allUsers = apiResponses
    .filter { it.success }                    // successful responses only
    .mapNotNull { it.data }                   // extract the payload, dropping nulls
    .flatMap { it }                           // flatten all user lists into one
    .filter { userDto ->                      // keep valid users
        userDto.email != null && 
        userDto.isActive == true
    }
    .map { userDto ->                         // convert to the domain model
        User(
            id = userDto.id,
            name = userDto.name,
            email = userDto.email!!,
            status = if (userDto.isActive!!) "active" else "inactive"
        )
    }
    .sortedBy { it.name }                     // sort by name

println("Active users: ${allUsers.size}")
allUsers.forEach { println("- ${it.name} (${it.email})") }

Processing financial data

import java.time.LocalDate
import java.time.Month

data class Transaction(
    val id: Int,
    val amount: Double,
    val type: String,  // "INCOME" or "EXPENSE"
    val category: String,
    val date: LocalDate
)

val transactions = listOf(
    Transaction(1, 1000.0, "INCOME", "Salary", LocalDate.of(2023, 1, 1)),
    Transaction(2, -500.0, "EXPENSE", "Rent", LocalDate.of(2023, 1, 2)),
    Transaction(3, -200.0, "EXPENSE", "Dining", LocalDate.of(2023, 1, 3)),
    Transaction(4, 300.0, "INCOME", "Side job", LocalDate.of(2023, 1, 4)),
    Transaction(5, -100.0, "EXPENSE", "Transport", LocalDate.of(2023, 1, 5))
)

// Monthly financial report
val monthlyReport = transactions
    .filter { it.date.month == Month.JANUARY }          // January data only
    .groupBy { it.category }                            // group by category
    .mapValues { (_, categoryTransactions) ->
        mapOf(
            "totalIncome" to categoryTransactions
                .filter { it.type == "INCOME" }
                .sumOf { it.amount },
            "totalExpense" to categoryTransactions
                .filter { it.type == "EXPENSE" }
                .sumOf { it.amount },
            "count" to categoryTransactions.size,
            "averageAmount" to categoryTransactions
                .map { it.amount }
                .average()
        )
    }
    .filter { (_, stats) ->                             // drop categories with no activity
        stats["totalIncome"] != 0.0 || stats["totalExpense"] != 0.0
    }

// Print the report
monthlyReport.forEach { (category, stats) ->
    println("Category: $category")
    stats.forEach { (key, value) ->
        println("  $key: $value")
    }
}
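
A map with string keys loses type information, which is why statistics stored that way have to be compared or cast at the point of use. A typed summary class is one possible alternative for the same groupBy and mapValues pipeline. The types Entry and CategorySummary and the function summarize are hypothetical simplifications, not part of the original example.

```kotlin
// Hypothetical types for illustration only.
data class Entry(val amount: Double, val type: String, val category: String)
data class CategorySummary(val income: Double, val expense: Double, val count: Int)

// Groups entries by category and folds each group into a typed summary.
fun summarize(entries: List<Entry>): Map<String, CategorySummary> =
    entries
        .groupBy { it.category }
        .mapValues { (_, items) ->
            CategorySummary(
                income = items.filter { it.type == "INCOME" }.sumOf { it.amount },
                expense = items.filter { it.type == "EXPENSE" }.sumOf { it.amount },
                count = items.size
            )
        }
```

With this shape a misspelled field name becomes a compile error instead of a silent null lookup.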

6. Complex filtering and mapping

Complex transformations with filter and map

data class Product(
    val id: Int,
    val name: String,
    val price: Double,
    val category: String,
    val inStock: Boolean,
    val tags: List<String>
)

val products = listOf(
    Product(1, "Laptop", 999.99, "Electronics", true, listOf("tech", "portable")),
    Product(2, "Phone", 699.99, "Electronics", false, listOf("tech", "mobile")),
    Product(3, "Coffee", 4.99, "Food", true, listOf("beverage", "breakfast")),
    Product(4, "Book", 19.99, "Education", true, listOf("learning", "entertainment")),
    Product(5, "Headphones", 199.99, "Electronics", true, listOf("tech", "audio"))
)

// A more complex query and transformation
val electronicsOnSale = products
    .filter { product ->
        // Several filter conditions
        product.category == "Electronics" &&
        product.inStock &&
        product.price < 500.0 &&  // price below 500
        product.tags.contains("tech")
    }
    .map { product ->
        // Work out the discount rate
        val discount = when {
            product.price > 300 -> 0.15  // 15% discount
            product.price > 100 -> 0.10  // 10% discount
            else -> 0.05                // 5% discount
        }
        
        mapOf(
            "id" to product.id,
            "name" to product.name,
            "originalPrice" to product.price,
            "discount" to discount,
            "finalPrice" to product.price * (1 - discount),
            "tags" to product.tags
        )
    }
    .sortedByDescending { it["finalPrice"] as Double }

println("Discounted electronics:")
electronicsOnSale.forEach { product ->
    println("${product["name"]}: ${product["originalPrice"]} -> ${product["finalPrice"]}")
}

Splitting data with partition

val numbers = listOf(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)

// Split the list into two with partition
val (even, odd) = numbers.partition { it % 2 == 0 }

// Process each half differently
val processedEven = even.map { it * 2 }      // [4, 8, 12, 16, 20]
val processedOdd = odd.map { it * 3 }        // [3, 9, 15, 21, 27]

// A more complex partitioning scenario
data class Order(
    val id: Int,
    val amount: Double,
    val status: String,  // "PENDING", "COMPLETED", "CANCELLED"
    val customerType: String  // "REGULAR", "VIP"
)

val orders = listOf(
    Order(1, 100.0, "COMPLETED", "REGULAR"),
    Order(2, 200.0, "PENDING", "VIP"),
    Order(3, 150.0, "CANCELLED", "REGULAR"),
    Order(4, 300.0, "COMPLETED", "VIP"),
    Order(5, 50.0, "PENDING", "REGULAR")
)

// Nested partitioning
val (completed, pendingOrCancelled) = orders.partition { it.status == "COMPLETED" }
val (vipOrders, regularOrders) = completed.partition { it.customerType == "VIP" }

// Compute statistics
val vipStats = vipOrders
    .map { it.amount }
    .let { amounts ->
        mapOf(
            "count" to amounts.size,
            "total" to amounts.sum(),
            "average" to amounts.average(),
            "max" to amounts.maxOrNull()
        )
    }

val regularStats = regularOrders
    .filter { it.amount > 100 }  // count only orders above 100 (none in this sample data)
    .map { it.amount }
    .let { amounts ->
        mapOf(
            "count" to amounts.size,
            "total" to amounts.sum()
        )
    }

println("VIP order stats: $vipStats")
println("Regular order stats: $regularStats")
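
partition always yields exactly two groups, so nesting it gets awkward once there are three or more categories, such as the three order statuses here. In that case groupBy followed by mapValues is a natural fit. A minimal sketch, with the hypothetical type SimpleOrder standing in for the Order class:

```kotlin
// Hypothetical, reduced order type for illustration.
data class SimpleOrder(val amount: Double, val status: String)

// One bucket per distinct status, each reduced to its total amount.
fun totalsByStatus(orders: List<SimpleOrder>): Map<String, Double> =
    orders
        .groupBy { it.status }
        .mapValues { (_, group) -> group.sumOf { it.amount } }
```

For the five sample orders above this gives 400.0 for COMPLETED, 250.0 for PENDING, and 150.0 for CANCELLED.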

7. Combining with flatMap

filter + flatMap

data class Department(
    val name: String,
    val employees: List<Employee>
)

data class Employee(
    val id: Int,
    val name: String,
    val salary: Double,
    val skills: List<String>
)

val departments = listOf(
    Department("Engineering", listOf(
        Employee(1, "Alice", 80000.0, listOf("Kotlin", "Java")),
        Employee(2, "Bob", 75000.0, listOf("Python", "JavaScript")),
        Employee(3, "Charlie", 90000.0, listOf("Kotlin", "Go"))
    )),
    Department("Sales", listOf(
        Employee(4, "David", 60000.0, listOf("Communication", "Negotiation")),
        Employee(5, "Eve", 65000.0, listOf("Kotlin", "Communication"))
    ))
)

// Find all well-paid employees who know Kotlin
val kotlinExperts = departments
    .flatMap { it.employees }                     // flatten all employees
    .filter { employee ->                         // filter conditions
        employee.skills.contains("Kotlin") &&
        employee.salary > 70000.0
    }
    .map { employee ->                           // map to the required shape
        mapOf(
            "name" to employee.name,
            "salary" to employee.salary,
            "department" to departments
                .first { dept -> dept.employees.contains(employee) }
                .name
        )
    }
    .sortedByDescending { it["salary"] as Double }

println("Kotlin experts:")
kotlinExperts.forEach { expert ->
    println("${expert["name"]} - ${expert["department"]} - \$${expert["salary"]}")
}
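
The pipeline above searches every department again for each matching employee in order to recover the department name. One way to avoid that lookup is to carry the name along while flattening, by pairing it with each employee inside flatMap. The types Team and Staff and the function name below are hypothetical stand-ins for Department and Employee.

```kotlin
// Hypothetical, reduced types for illustration.
data class Staff(val name: String, val salary: Double, val skills: List<String>)
data class Team(val name: String, val members: List<Staff>)

// Returns (employee name, team name) pairs, highest salary first.
fun kotlinExperts(teams: List<Team>, minSalary: Double): List<Pair<String, String>> =
    teams
        .flatMap { team -> team.members.map { team.name to it } }   // keep the team name with each member
        .filter { (_, staff) -> "Kotlin" in staff.skills && staff.salary > minSalary }
        .sortedByDescending { (_, staff) -> staff.salary }
        .map { (teamName, staff) -> staff.name to teamName }
```

Each employee is visited once, and no equality comparison between employee objects is needed.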

8. Custom extension functions for combined operations

Defining custom combined operations

// Generic filter-then-map extension
inline fun <T, R> Iterable<T>.filterMap(
    predicate: (T) -> Boolean,
    transform: (T) -> R
): List<R> {
    return this.filter(predicate).map(transform)
}

// Indexed filter-and-map in a single pass (R : Any matches the bound required by mapIndexedNotNull)
inline fun <T, R : Any> Iterable<T>.filterMapIndexed(
    predicate: (index: Int, T) -> Boolean,
    transform: (index: Int, T) -> R
): List<R> {
    return this.mapIndexedNotNull { index, value ->
        if (predicate(index, value)) {
            transform(index, value)
        } else {
            null
        }
    }
}

// Safe-conversion extension
inline fun <T, R : Any> Iterable<T>.safeMap(
    transform: (T) -> R?
): List<R> {
    return this.mapNotNull(transform)
}

// Usage examples
val numbers = listOf(1, 2, 3, 4, 5, 6)

val squaresOfEvens = numbers.filterMap(
    predicate = { it % 2 == 0 },
    transform = { it * it }
)  // [4, 16, 36]

val strings = listOf("1", "2", "three", "4", "five")
val parsedNumbers = strings.safeMap { it.toIntOrNull() }  // [1, 2, 4]
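
The same idea carries over to lazy pipelines. The sketch below defines a Sequence counterpart of filterMap and applies it to an infinite sequence, which works only because nothing is evaluated until take and toList ask for elements. Both function names are hypothetical.

```kotlin
// Lazy counterpart of filterMap; nothing runs until a terminal operation such as toList().
fun <T, R> Sequence<T>.filterMap(
    predicate: (T) -> Boolean,
    transform: (T) -> R
): Sequence<R> = filter(predicate).map(transform)

// Squares of the first `limit` even numbers, drawn from an infinite source.
fun firstEvenSquares(limit: Int): List<Int> =
    generateSequence(1) { it + 1 }      // 1, 2, 3, ...
        .filterMap({ it % 2 == 0 }, { it * it })
        .take(limit)
        .toList()
```

For example, firstEvenSquares(3) returns [4, 16, 36].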

9. Performance best practices

Avoiding repeated computation

// Wasteful: it * it is computed twice for every even number
val bad = listOf(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
    .filter { it % 2 == 0 && it * it > 20 }
    .map { it * it }

// Better: compute the square once, then filter on the result
val good = listOf(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
    .filter { it % 2 == 0 }
    .map { it * it }           // [4, 16, 36, 64, 100]
    .filter { it > 20 }        // [36, 64, 100]

// For large inputs: the same pipeline as a Sequence, without intermediate lists
val better = listOf(1, 2, 3, 4, 5, 6, 7, 8, 9, 10).asSequence()
    .filter { it % 2 == 0 }
    .map { it * it }
    .filter { it > 20 }
    .toList()

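Choosing a strategy by data size

import java.util.stream.Collectors

// The thresholds are illustrative; measure with real data before relying on them
fun processLargeDataset(data: List<Int>): List<String> {
    return when {
        data.size < 1000 -> {
            // Small data set: regular chained operations
            data
                .filter { it > 0 }
                .map { it.toString() }
        }
        data.size < 10000 -> {
            // Medium data set: Sequence
            data.asSequence()
                .filter { it > 0 }
                .map { it.toString() }
                .toList()
        }
        else -> {
            // Large data set: parallel processing (JVM only)
            data.parallelStream()
                .filter { it > 0 }
                .map { it.toString() }
                .collect(Collectors.toList())   // Stream.toList() would require Java 16+
        }
    }
}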
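
Because the right threshold depends on the element type, the lambdas, and the JVM, it may help to time the candidate strategies on representative data before hard-coding a cut-off. The sketch below uses measureTimeMillis from kotlin.system; the helper name compareOnce is hypothetical, and a single run without warm-up gives only a rough indication.

```kotlin
import kotlin.system.measureTimeMillis

// Hypothetical helper: runs both variants once and returns (list millis, sequence millis).
fun compareOnce(data: List<Int>): Pair<Long, Long> {
    var viaList: List<String> = emptyList()
    var viaSequence: List<String> = emptyList()

    val listMs = measureTimeMillis {
        viaList = data.filter { it > 0 }.map { it.toString() }
    }
    val sequenceMs = measureTimeMillis {
        viaSequence = data.asSequence().filter { it > 0 }.map { it.toString() }.toList()
    }

    check(viaList == viaSequence)   // timings are meaningful only if both variants agree
    return listMs to sequenceMs
}
```

For trustworthy numbers, repeat the measurement several times or use a benchmarking harness.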

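Summary

Combining map and filter is at the core of functional-style programming in Kotlin.

Best practices:

  1. Order matters: filtering before mapping is usually more efficient
  2. Use Sequence: prefer lazy evaluation for large data sets, especially when the chain can stop early (take, first)
  3. Merge operations: combine adjacent filter and map steps where the code stays readable
  4. Use mapNotNull: it replaces filter + map for nullable conversions
  5. Consider partition: use it when the data must be split into two groups
  6. Avoid repeated computation: cache intermediate results or use local variables

Performance tips (the thresholds are rough rules of thumb; measure with real data):

  • Small data sets (< 1000): use regular chained operations
  • Medium data sets (1000-10000): consider Sequence
  • Large data sets (> 10000): use Sequence or parallel streams

Readability:

  • Keep chains short (usually no more than 3-4 steps)
  • Create custom extension functions for complex operations
  • Use meaningful variable names and comments

Combined sensibly, map and filter lead to code that is both efficient and easy to understand.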